==> Building on infernape ==> Checking for remote environment... ==> Syncing package to remote host... sending incremental file list ./ .SRCINFO 700 63% 0.00kB/s 0:00:00 1,108 100% 398.44kB/s 0:00:00 (xfr#1, to-chk=7/9) .nvchecker.toml 92 100% 89.84kB/s 0:00:00 92 100% 89.84kB/s 0:00:00 (xfr#2, to-chk=6/9) LICENSE 646 100% 630.86kB/s 0:00:00 646 100% 630.86kB/s 0:00:00 (xfr#3, to-chk=5/9) PKGBUILD 700 26% 683.59kB/s 0:00:00 2,603 100% 2.48MB/s 0:00:00 (xfr#4, to-chk=4/9) REUSE.toml 375 100% 366.21kB/s 0:00:00 375 100% 366.21kB/s 0:00:00 (xfr#5, to-chk=3/9) pastix-6.4.0-5.log 696 100% 679.69kB/s 0:00:00 696 100% 679.69kB/s 0:00:00 (xfr#6, to-chk=2/9) LICENSES/ sent 1,122 bytes received 210 bytes 888.00 bytes/sec total size is 5,030 speedup is 3.78 ==> Patching arch to riscv64... ==> Running pkgctl build --arch riscv64 on remote host... ==> WARNING: invalid architecture: riscv64 ==> Updating pacman database cache ==> Locking pacman database cache...done :: Synchronizing package databases... core downloading... extra downloading... multilib downloading... ==> Building pastix  -> repo: extra  -> arch: riscv64  -> worker: felix-3 ==> Building pastix for [extra] (riscv64) ==> Locking clean chroot...done :: Synchronizing package databases... core downloading... extra downloading... :: Starting full system upgrade... there is nothing to do ==> Building in chroot for [extra] (riscv64)... ==> Synchronizing chroot copy [/var/lib/archbuild/extra-riscv64/root] -> [felix-3]...done ==> Making package: pastix 6.4.0-5 (Sun May 17 05:40:18 2026) ==> Retrieving sources...  
-> Found pastix-6.4.0.tar.gz ==> Validating source files with b2sums... pastix-6.4.0.tar.gz ... Passed ==> Making package: pastix 6.4.0-5 (Sat May 16 21:40:46 2026) ==> Checking runtime dependencies... ==> Installing missing dependencies... resolving dependencies... looking for conflicting packages... Package (13) New Version Net Change extra/blas 3.12.1-2 0.43 MiB extra/lapack 3.12.1-2 9.09 MiB extra/libfabric 2.5.1-1 7.10 MiB extra/libpciaccess 0.19-1 0.05 MiB extra/numactl 2.0.19-1 0.20 MiB extra/openpmix 5.0.10-1 3.58 MiB extra/openucx 1.20.0-3 6.56 MiB extra/prrte 3.0.13-1 1.89 MiB extra/cblas 3.12.1-2 0.31 MiB extra/hwloc 2.13.0-1 1.51 MiB extra/lapacke 3.12.1-2 5.44 MiB extra/openmpi 5.0.10-2 9.26 MiB extra/scotch 7.0.11-4 1.55 MiB Total Installed Size: 46.99 MiB :: Proceed with installation? [Y/n] checking keyring... checking package integrity... loading package files... checking for file conflicts... :: Processing package changes... installing blas... installing cblas... installing libpciaccess... installing hwloc... Optional dependencies for hwloc cairo: PDF, Postscript, and PNG export support libxml2: full XML import/export support [installed] installing lapack... installing lapacke... installing numactl... installing libfabric... installing openpmix... Optional dependencies for openpmix openpmix-docs: for documentation installing openucx... Optional dependencies for openucx rdma-core: for InfiniBand and RDMA support rocm-language-runtime: for ROCm support installing prrte... Optional dependencies for prrte openssh: for execution on remote hosts via plm_ssh_agent prrte-docs: for documentation installing openmpi... 
Optional dependencies for openmpi hip-runtime-amd: ROCm support gcc-fortran: fortran support openssh: for execution on remote hosts via plm_ssh_agent installing scotch... :: Running post-transaction hooks... (1/1) Arming ConditionNeedsUpdate... ==> Checking buildtime dependencies... ==> Installing missing dependencies... resolving dependencies... looking for conflicting packages... Package (31) New Version Net Change Download Size extra/clang 22.1.5-1 245.84 MiB extra/compiler-rt 22.1.5-1 166.71 MiB extra/cppdap 1.58.0-3 1.57 MiB extra/fmt 12.1.0-2 0.68 MiB extra/hicolor-icon-theme 0.18-1 0.05 MiB extra/jsoncpp 1.9.6-3 3.16 MiB core/libedit 20251016_3.1-1 0.25 MiB extra/libuv 1.52.1-1 0.62 MiB extra/llvm-libs 22.1.5-1 154.50 MiB extra/perl-error 0.17030-3 0.04 MiB extra/perl-mailtools 2.22-3 0.10 MiB extra/perl-timedate 2.35-1 0.15 MiB extra/python-certifi 2026.04.22-1 0.02 MiB 0.01 MiB extra/python-charset-normalizer 3.4.6-1 0.94 MiB extra/python-idna 3.14-1 0.88 MiB extra/python-packaging 26.2-1 1.23 MiB extra/python-platformdirs 4.9.6-1 0.40 MiB extra/python-pooch 1.9.0-1 0.75 MiB extra/python-requests 2.33.1-1 0.60 MiB extra/python-urllib3 2.6.3-1 1.44 MiB extra/rhash 1.4.6-1 0.35 MiB extra/spdlog 1.17.0-2 0.67 MiB extra/zlib-ng 2.3.3-1 0.23 MiB extra/cmake 4.3.2-1 85.40 MiB extra/doxygen 1.16.1-3 20.37 MiB core/gcc-fortran 16.1.1+r12+g301eb08fa2c5-1 73.32 MiB extra/git 2.54.0-1 29.36 MiB extra/ninja 1.13.2-3 0.36 MiB extra/python-mpi4py 4.1.1-2 2.85 MiB extra/python-numpy 2.4.4-1 41.12 MiB extra/python-scipy 1.17.1-1 110.75 MiB Total Download Size: 0.01 MiB Total Installed Size: 944.71 MiB :: Proceed with installation? [Y/n] :: Retrieving packages... python-certifi-2026.04.22-1-any downloading... checking keyring... checking package integrity... loading package files... checking for file conflicts... :: Processing package changes... installing gcc-fortran... installing cppdap... installing hicolor-icon-theme... installing jsoncpp... 
Optional dependencies for jsoncpp jsoncpp-doc: documentation installing libuv... installing rhash... installing cmake... Optional dependencies for cmake make: for unix Makefile generator [installed] ninja: for ninja generator [pending] qt6-base: cmake-gui installing ninja... installing libedit... installing llvm-libs... installing compiler-rt... installing clang... Optional dependencies for clang openmp: OpenMP support in clang with -fopenmp python: for scan-view and git-clang-format [installed] llvm: referenced by some clang headers installing fmt... installing spdlog... installing doxygen... Optional dependencies for doxygen graphviz: for caller/callee graph generation qt6-base: for doxywizard qt6-svg: for doxywizard texlive-fontsrecommended: for generating LaTeX, Postscript and PDF output texlive-fontutils: for generating LaTeX, Postscript and PDF output texlive-latexextra: for generating LaTeX, Postscript and PDF output texlive-plaingeneric: for generating LaTeX, Postscript and PDF output installing perl-error... installing perl-timedate... installing perl-mailtools... installing zlib-ng... installing git... Optional dependencies for git git-zsh-completion: upstream zsh completion tk: gitk and git gui openssh: ssh transport and crypto man: show help with `git command --help` perl-libwww: git svn perl-term-readkey: git svn and interactive.singlekey setting perl-io-socket-ssl: git send-email TLS support perl-authen-sasl: git send-email TLS support perl-cgi: gitweb (web interface) support python: git svn & git p4 [installed] subversion: git svn org.freedesktop.secrets: keyring credential helper libsecret: libsecret credential helper [installed] less: the default pager for git installing python-numpy... Optional dependencies for python-numpy blas-openblas: faster linear algebra installing python-mpi4py... installing python-platformdirs... installing python-packaging... installing python-charset-normalizer... installing python-idna... installing python-urllib3... 
Optional dependencies for python-urllib3 python-brotli: Brotli support python-brotlicffi: Brotli support python-h2: HTTP/2 support python-pysocks: SOCKS support installing python-certifi... installing python-requests... Optional dependencies for python-requests python-chardet: alternative character encoding library python-pysocks: SOCKS proxy support installing python-pooch... Optional dependencies for python-pooch python-paramiko: for SFTP downloads python-tqdm: for printing a download progress bar installing python-scipy... Optional dependencies for python-scipy python-pillow: for image saving module :: Running post-transaction hooks... (1/5) Creating system user accounts... Creating group 'git' with GID 969. Creating user 'git' (git daemon user) with UID 969 and GID 969. (2/5) Reloading system manager configuration... Skipped: Current root is not booted. (3/5) Arming ConditionNeedsUpdate... (4/5) Checking for old perl modules... (5/5) Updating the info directory file... ==> Retrieving sources...  -> Found pastix-6.4.0.tar.gz ==> WARNING: Skipping all source file integrity checks. ==> Extracting sources...  -> Extracting pastix-6.4.0.tar.gz with bsdtar ==> Starting build()... 
-- The C compiler identification is GNU 16.1.1 -- The CXX compiler identification is GNU 16.1.1 -- The Fortran compiler identification is GNU 16.1.1 -- Detecting C compiler ABI info -- Detecting C compiler ABI info - done -- Check for working C compiler: /usr/bin/cc - skipped -- Detecting C compile features -- Detecting C compile features - done -- Detecting CXX compiler ABI info -- Detecting CXX compiler ABI info - done -- Check for working CXX compiler: /usr/bin/c++ - skipped -- Detecting CXX compile features -- Detecting CXX compile features - done -- Detecting Fortran compiler ABI info -- Detecting Fortran compiler ABI info - done -- Check for working Fortran compiler: /usr/bin/gfortran - skipped -- Found PkgConfig: /usr/bin/pkg-config (found version "2.5.1") -- Performing Test HAVE_C_TSAN -- Performing Test HAVE_C_TSAN - Success -- Performing Test HAVE_CXX_TSAN -- Performing Test HAVE_CXX_TSAN - Success -- Performing Test HAVE_C_ASAN -- Performing Test HAVE_C_ASAN - Success -- Performing Test HAVE_CXX_ASAN -- Performing Test HAVE_CXX_ASAN - Success -- Performing Test HAVE_C_LSAN -- Performing Test HAVE_C_LSAN - Success -- Performing Test HAVE_CXX_LSAN -- Performing Test HAVE_CXX_LSAN - Success -- Performing Test HAVE_C_MSAN -- Performing Test HAVE_C_MSAN - Failed -- Performing Test HAVE_CXX_MSAN -- Performing Test HAVE_CXX_MSAN - Failed -- Performing Test HAVE_C_UBSAN -- Performing Test HAVE_C_UBSAN - Success -- Performing Test HAVE_CXX_UBSAN -- Performing Test HAVE_CXX_UBSAN - Success -- Building for target riscv64 -- Performing Test HAVE_WALL -- Performing Test HAVE_WALL - Success -- Performing Test HAVE_WEXTRA -- Performing Test HAVE_WEXTRA - Success -- Performing Test HAVE_G3 -- Performing Test HAVE_G3 - Success -- Performing Test HAVE_ATOMIC_GCC_32_BUILTINS -- Performing Test HAVE_ATOMIC_GCC_32_BUILTINS - Success -- Performing Test HAVE_ATOMIC_GCC_64_BUILTINS -- Performing Test HAVE_ATOMIC_GCC_64_BUILTINS - Success -- Performing Test 
HAVE_ATOMIC_GCC_128_BUILTINS -- Performing Test HAVE_ATOMIC_GCC_128_BUILTINS - Failed -- Performing Test HAVE_ATOMIC_GCC_128_BUILTINS -- Performing Test HAVE_ATOMIC_GCC_128_BUILTINS - Failed -- Performing Test HAVE_ATOMIC_XLC_32_BUILTINS -- Performing Test HAVE_ATOMIC_XLC_32_BUILTINS - Failed -- Performing Test HAVE_ATOMIC_MIPOSPRO_32_BUILTINS -- Performing Test HAVE_ATOMIC_MIPOSPRO_32_BUILTINS - Failed -- Performing Test HAVE_ATOMIC_SUN_32 -- Performing Test HAVE_ATOMIC_SUN_32 - Failed -- support for 32 bits atomics - found -- support for 64 bits atomics - found CMake Warning at cmake_modules/CheckAtomicIntrinsic.cmake:183 (message): 128 bit atomics not found but pointers are 64 bits. Some list operations will not be optimized Call Stack (most recent call first): cmake_modules/CheckSystem.cmake:74 (include) CMakeLists.txt:94 (include) -- Performing Test HAVE_FALLTHROUGH -- Performing Test HAVE_FALLTHROUGH - Success -- Performing Test HAVE_BUILTIN_EXPECT -- Performing Test HAVE_BUILTIN_EXPECT - Success -- Performing Test CMAKE_HAVE_LIBC_PTHREAD -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success -- Found Threads: TRUE -- Looking for sched_setaffinity -- Looking for sched_setaffinity - found -- Performing Test HAVE_TIMESPEC_TV_NSEC -- Performing Test HAVE_TIMESPEC_TV_NSEC - Success -- Looking for clock_gettime in rt -- Looking for clock_gettime in rt - found -- Looking for include file stdarg.h -- Looking for include file stdarg.h - found -- Performing Test HAVE_VA_COPY -- Performing Test HAVE_VA_COPY - Success -- Looking for asprintf -- Looking for asprintf - found -- Looking for vasprintf -- Looking for vasprintf - found -- Looking for include file getopt.h -- Looking for include file getopt.h - found -- Looking for include file unistd.h -- Looking for include file unistd.h - found -- Looking for getopt_long -- Looking for getopt_long - found -- Looking for include file errno.h -- Looking for include file errno.h - found -- Looking for include file stddef.h -- 
Looking for include file stddef.h - found -- Looking for include file stdbool.h -- Looking for include file stdbool.h - found -- Looking for getrusage -- Looking for getrusage - found -- Looking for RUSAGE_THREAD -- Looking for RUSAGE_THREAD - not found -- Looking for include file limits.h -- Looking for include file limits.h - found -- Looking for include file string.h -- Looking for include file string.h - found -- Looking for include file libgen.h -- Looking for include file libgen.h - found -- Looking for include file complex.h -- Looking for include file complex.h - found -- Looking for include file sys/param.h -- Looking for include file sys/param.h - found -- Looking for include file sys/types.h -- Looking for include file sys/types.h - found -- Looking for include file syslog.h -- Looking for include file syslog.h - found -- Looking for getline -- Looking for getline - found -- Looking for mkdtemp -- Looking for mkdtemp - found -- Performing Test HAVE_FTZ_MACROS -- Performing Test HAVE_FTZ_MACROS - Failed -- Performing Test HAVE_DAZ_MACROS -- Performing Test HAVE_DAZ_MACROS - Failed -- Performing Test HAVE_MM_SETCSR -- Performing Test HAVE_MM_SETCSR - Failed -- Looking for sqrt -- Looking for sqrt - not found -- Looking for sqrt -- Looking for sqrt - found -- Found M: /lib/libm.so -- Looking for Fortran sgemm -- Looking for Fortran sgemm - not found -- Looking for Fortran sgemm -- Looking for Fortran sgemm - found -- CBLAS_BLAS BLAS -- Looking for cblas_dscal -- Looking for cblas_dscal - not found -- CBLAS_WORKS -- Looking for cblas : test with blas failed or CBLAS_STANDALONE enabled -- Checking for one of the modules 'cblas' -- Looking for CBLAS - found using PkgConfig -- Looking for cblas_dscal -- Looking for cblas_dscal - found -- Looking for cblas_zgemm3m -- Looking for cblas_zgemm3m - not found -- Looking for cblas_cgemm3m -- Looking for cblas_cgemm3m - not found -- Found CBLAS: /usr/lib/libcblas.so -- Looking for Fortran cheev -- Looking for Fortran 
cheev - not found -- Looking for Fortran cheev -- Looking for Fortran cheev - found -- Looking for LAPACKE_dgeqrf -- Looking for LAPACKE_dgeqrf - not found -- Looking for lapacke : test with lapack fails -- Checking for one of the modules 'lapacke' -- Looking for LAPACKE - found using PkgConfig -- Looking for LAPACKE_dgeqrf -- Looking for LAPACKE_dgeqrf - found -- Looking for LAPACKE_dlascl_work -- Looking for LAPACKE_dlascl_work - found -- Found LAPACKE: /usr/lib/liblapacke.so -- FindHWLOC needs pkg-config program and PKG_CONFIG_PATH set HWLOC.pc file path. -- Checking for one of the modules 'hwloc' -- Looking for HWLOC - found using PkgConfig -- Looking for hwloc_topology_init -- Looking for hwloc_topology_init - found -- Found HWLOC: /usr/lib/libhwloc.so -- Found MPI_C: /lib/libmpi.so (found version "3.1") -- Found MPI_CXX: /lib/libmpi.so (found version "3.1") -- Found MPI_Fortran: /lib/libmpi_usempif08.so (found version "3.1") -- Found MPI: TRUE (found version "3.1") -- Looking for SCOTCH_graphInit -- Looking for SCOTCH_graphInit - found -- Looking for SCOTCH_contextInit -- Looking for SCOTCH_contextInit - found -- Performing Test SCOTCH_Num_4 -- Performing Test SCOTCH_Num_4 - Success -- Performing Test SCOTCH_Num_8 -- Performing Test SCOTCH_Num_8 - Failed -- Found SCOTCH: /lib/libscotch.so;/lib/libscotcherr.so;/lib/libz.so;/lib/libm.so;/lib/librt.a -- Checking for one of the modules 'gtg' -- Use internal SPM -- Building for target riscv64 -- CBLAS_BLAS BLAS -- Looking for cblas_dscal -- Looking for cblas_dscal - found -- Looking for cblas_zgemm3m -- Looking for cblas_zgemm3m - not found -- Looking for cblas_cgemm3m -- Looking for cblas_cgemm3m - not found -- CBLAS_WORKS 1 -- Looking for cblas: test with blas succeeded -- Found CBLAS: /usr/lib/libcblas.so;/lib/libblas.so -- Looking for LAPACKE_dgeqrf -- Looking for LAPACKE_dgeqrf - found -- Looking for LAPACKE_dlascl_work -- Looking for LAPACKE_dlascl_work - found -- Looking for lapacke: test with lapack 
succeeded -- Found LAPACKE: /usr/lib/liblapacke.so;/lib/liblapack.so;/lib/libblas.so -- Looking for LAPACKE_zlassq_work -- Looking for LAPACKE_zlassq_work - found -- Add definition LAPACKE_WITH_LASSQ -- Add definition LAPACKE_WITH_LASCL -- Performing Test SPM_MPI_COMM_C_4 -- Performing Test SPM_MPI_COMM_C_4 - Failed -- Performing Test SPM_MPI_COMM_C_8 -- Performing Test SPM_MPI_COMM_C_8 - Success -- Looking for SCOTCH_graphInit -- Looking for SCOTCH_graphInit - found -- Looking for SCOTCH_contextInit -- Looking for SCOTCH_contextInit - found -- Performing Test SCOTCH_Num_4 -- Performing Test SCOTCH_Num_4 - Success -- Performing Test SCOTCH_Num_8 -- Performing Test SCOTCH_Num_8 - Failed -- Generate precision dependencies in /build/pastix/src/pastix-6.4.0/spm -- Generate precision dependencies in /build/pastix/src/pastix-6.4.0/spm - Done -- Building for target riscv64 -- Generate precision dependencies in /build/pastix/src/pastix-6.4.0/spm/src -- Generate precision dependencies in /build/pastix/src/pastix-6.4.0/spm/src - Done -- Generate precision dependencies in /build/pastix/src/pastix-6.4.0/spm/tests -- Generate precision dependencies in /build/pastix/src/pastix-6.4.0/spm/tests - Done -- No installation of spm_env.sh - already in the default environment -- Generate precision dependencies in /build/pastix/src/pastix-6.4.0/kernels -- Generate precision dependencies in /build/pastix/src/pastix-6.4.0/kernels - Done -- Generate precision dependencies in /build/pastix/src/pastix-6.4.0/kernels -- Generate precision dependencies in /build/pastix/src/pastix-6.4.0/kernels - Done -- Generate precision dependencies in /build/pastix/src/pastix-6.4.0/kernels -- Generate precision dependencies in /build/pastix/src/pastix-6.4.0/kernels - Done -- Generate precision dependencies in /build/pastix/src/pastix-6.4.0/kernels -- Generate precision dependencies in /build/pastix/src/pastix-6.4.0/kernels - Done -- Generate precision dependencies in /build/pastix/src/pastix-6.4.0/refinement 
-- Generate precision dependencies in /build/pastix/src/pastix-6.4.0/refinement - Done -- Generate precision dependencies in /build/pastix/src/pastix-6.4.0 -- Generate precision dependencies in /build/pastix/src/pastix-6.4.0 - Done -- Generate precision dependencies in /build/pastix/src/pastix-6.4.0 -- Generate precision dependencies in /build/pastix/src/pastix-6.4.0 - Done -- Generate precision dependencies in /build/pastix/src/pastix-6.4.0 -- Generate precision dependencies in /build/pastix/src/pastix-6.4.0 - Done -- Generate precision dependencies in /build/pastix/src/pastix-6.4.0 -- Generate precision dependencies in /build/pastix/src/pastix-6.4.0 - Done -- Generate precision dependencies in /build/pastix/src/pastix-6.4.0 -- Generate precision dependencies in /build/pastix/src/pastix-6.4.0 - Done -- Generate precision dependencies in /build/pastix/src/pastix-6.4.0 -- Generate precision dependencies in /build/pastix/src/pastix-6.4.0 - Done -- Generate precision dependencies in /build/pastix/src/pastix-6.4.0 -- Generate precision dependencies in /build/pastix/src/pastix-6.4.0 - Done -- Generate precision dependencies in /build/pastix/src/pastix-6.4.0/test -- Generate precision dependencies in /build/pastix/src/pastix-6.4.0/test - Done -- Generate precision dependencies in /build/pastix/src/pastix-6.4.0/test -- Generate precision dependencies in /build/pastix/src/pastix-6.4.0/test - Done -- Found OpenMP_C: -fopenmp (found version "5.2") -- Found OpenMP_CXX: -fopenmp (found version "5.2") -- Found OpenMP_Fortran: -fopenmp (found version "5.2") -- Found OpenMP: TRUE (found version "5.2") -- Found UnixCommands: /usr/bin/bash -- No installation of pastix_env.sh - already in the default environment -- Found Doxygen: /usr/bin/doxygen (found version "1.16.1") found components: doxygen missing components: dot Configuration of Pastix: PASTIX_VERSION ......: 6.4.0 BUILD_TYPE ..........: None BUILDNAME ...........: Linux-amd64-cc-None-MPI SITE ................: 
arch-nspawn-3655178 Compiler: C .........: /usr/bin/cc (GNU) Compiler: Fortran ...: /usr/bin/gfortran (GNU) Compiler: MPI .......: /usr/bin/mpicc Compiler flags ......: Linker: .............: /usr/bin/ld Build type ..........: None Build shared ........: ON CFlags ..............: -march=rv64gc -mabi=lp64d -O2 -pipe -fno-plt -fexceptions -Wp,-D_FORTIFY_SOURCE=3 -Wformat -Werror=format-security -fstack-clash-protection -fno-omit-frame-pointer -g -ffile-prefix-map=/build/pastix/src=/usr/src/debug/pastix -flto=auto -Wall -Wextra -Drestrict= LDFlags .............: -Wl,-O1 -Wl,--sort-common -Wl,--as-needed -Wl,-z,relro -Wl,-z,now -flto=auto EXE LDFlags .........: -Wl,-O1 -Wl,--sort-common -Wl,--as-needed -Wl,-z,relro -Wl,-z,now -flto=auto Implementation paradigm MPI .................: ON CUDA ................: OFF Ordering selected SCOTCH ..............: ON Mutithreaded ......: ON PTSCOTCH ............: OFF METIS ...............: OFF Runtime specific PARSEC ..............: OFF STARPU ..............: OFF Kernels specific CBLAS ...............: TRUE LAPACKE .............: TRUE HWLOC ...............: TRUE Trace ...............: OFF Binaries to build documentation .......: ON testing .............: ON precisions ..........: sdcz INSTALL_PREFIX ......: /usr -- Configuration is done - A summary of the current configuration has been written in /build/pastix/src/build/config.log -- Configuring done (487.4s) -- Generating done (1.1s) -- Build files have been written to: /build/pastix/src/build [1/966] Generating include/spm/c_spm.h [2/966] Generating include/spm/d_spm.h [3/966] Generating c_nan_check.h [4/966] Generating cpucblk_cpack.h [5/966] Generating pastix_dscores.h [6/966] Generating include/spm/s_spm.h [7/966] Generating cpucblk_dpack.h [8/966] Generating include/spm/z_spm.h [9/966] Generating pastix_zccores.h [10/966] Generating cpucblk_spack.h [11/966] Generating cpucblk_zpack.h [12/966] Generating d_nan_check.h [13/966] Generating pastix_ccuda.h [14/966] Generating 
pastix_dcuda.h [15/966] Generating pastix_scuda.h [16/966] Generating pastix_zcuda.h [17/966] Generating s_nan_check.h [18/966] Generating pastix_clrcores.h [19/966] Generating z_nan_check.h [20/966] Generating pastix_dlrcores.h [21/966] Generating sopalin/coeftab_d.h [22/966] Generating sopalin/coeftab_c.h [23/966] Generating sopalin/coeftab_s.h [24/966] Generating sopalin/coeftab_z.h [25/966] Generating pastix_slrcores.h [26/966] Generating bcsc/bcsc_d.h [27/966] Generating bcsc/bcsc_c.h [28/966] Generating bcsc/bcsc_s.h [29/966] Generating bcsc/bcsc_z.h [30/966] Generating pastix_ccores.h [31/966] Generating common/d_integer.c [32/966] Generating pastix_dcores.h [33/966] Generating common/c_integer.c [34/966] Generating pastix_zlrcores.h [35/966] Generating common/s_integer.c [36/966] Generating pastix_scores.h [37/966] Generating common/z_integer.c [38/966] Generating pastix_zcores.h [39/966] Generating bcsc/bcsc_dnorm.c [40/966] Generating bcsc/bcsc_cnorm.c [41/966] Generating bcsc/bcsc_snorm.c [42/966] Generating bcsc/bcsc_znorm.c [43/966] Generating bcsc/bcsc_dspmv.c [44/966] Generating bcsc/bcsc_cspmv.c [45/966] Generating bcsc/bcsc_sspmv.c [46/966] Generating bcsc/bcsc_zspmv.c [47/966] Generating sopalin/coeftab_cinit.c [48/966] Generating sopalin/coeftab_dinit.c [49/966] Generating sopalin/coeftab_sinit.c [50/966] Generating bcsc/bvec_dmpi_comm.c [51/966] Generating sopalin/coeftab_zinit.c [52/966] Generating bcsc/bvec_cmpi_comm.c [53/966] Generating bcsc/bvec_zmpi_comm.c [54/966] Generating bcsc/bvec_smpi_comm.c [55/966] Generating sopalin/sequential_sdiag.c [56/966] Generating sopalin/sequential_ddiag.c [57/966] Generating sopalin/sequential_zdiag.c [58/966] Generating bcsc/bvec_dlapmr.c [59/966] Generating bcsc/bvec_clapmr.c [60/966] Generating sopalin/coeftab_d.c [61/966] Generating bcsc/bvec_slapmr.c [62/966] Generating bcsc/bvec_zlapmr.c [63/966] Generating sopalin/coeftab_zcinit.c [64/966] Generating sopalin/sequential_dgetrf.c [65/966] Generating 
sopalin/sequential_cgetrf.c [66/966] Generating sopalin/coeftab_c.c [67/966] Generating sopalin/coeftab_s.c [68/966] Generating sopalin/coeftab_dsinit.c [69/966] Generating sopalin/sequential_zgetrf.c [70/966] Generating sopalin/sequential_sgetrf.c [71/966] Generating sopalin/coeftab_z.c [72/966] Generating bcsc/bcsc_dinit.c [73/966] Generating sopalin/sequential_chetrf.c [74/966] Generating bcsc/bcsc_cinit.c [75/966] Generating bcsc/bcsc_sinit.c [76/966] Generating sopalin/sequential_zhetrf.c [77/966] Generating sopalin/sequential_cpotrf.c [78/966] Generating sopalin/sequential_dpotrf.c [79/966] Generating bcsc/bvec_ccompute.c [80/966] Generating bcsc/bcsc_zinit.c [81/966] Generating sopalin/sequential_spotrf.c [82/966] Generating bcsc/bvec_dcompute.c [83/966] Generating sopalin/sequential_cdiag.c [84/966] Generating sopalin/sequential_zpotrf.c [85/966] Generating bcsc/bvec_scompute.c [86/966] Generating include/spm/p_spm.h [87/966] Generating bcsc/bvec_zcompute.c [88/966] Generating sopalin/sequential_cpxtrf.c [89/966] Generating sopalin/sequential_zpxtrf.c [90/966] Generating sopalin/sequential_dsytrf.c [91/966] Generating sopalin/sequential_csytrf.c [92/966] Generating sopalin/sequential_ssytrf.c [93/966] Generating sopalin/sequential_zsytrf.c [94/966] Generating sopalin/sequential_dtrsm.c [95/966] Generating sopalin/sequential_ctrsm.c [96/966] Generating sopalin/sequential_strsm.c [97/966] Generating sopalin/sequential_ztrsm.c [98/966] Generating refinement/d_refine_bicgstab.c [99/966] Generating refinement/c_refine_bicgstab.c [100/966] Generating refinement/s_refine_bicgstab.c [101/966] Generating refinement/z_refine_bicgstab.c [102/966] Generating refinement/d_refine_functions.c [103/966] Generating refinement/c_refine_functions.c [104/966] Generating refinement/s_refine_functions.c [105/966] Generating refinement/z_refine_functions.c [106/966] Generating refinement/d_refine_gmres.c [107/966] Generating refinement/c_refine_gmres.c [108/966] Generating 
refinement/s_refine_gmres.c [109/966] Generating refinement/z_refine_gmres.c [110/966] Generating refinement/d_refine_grad.c [111/966] Generating refinement/c_refine_grad.c [112/966] Generating refinement/s_refine_grad.c [113/966] Generating refinement/z_refine_grad.c [114/966] Generating refinement/d_refine_pivot.c [115/966] Generating refinement/c_refine_pivot.c [116/966] Generating refinement/s_refine_pivot.c [117/966] Generating refinement/z_refine_pivot.c [118/966] Generating c_refine_functions.h [119/966] Generating d_refine_functions.h [120/966] Generating s_refine_functions.h [121/966] Generating z_refine_functions.h [122/966] Generating c_tests.h [123/966] Generating d_tests.h [124/966] Generating s_tests.h [125/966] Generating z_tests.h [126/966] Generating d_spm_dof_extend.c [127/966] Generating c_spm_dof_extend.c [128/966] Generating s_spm_dof_extend.c [129/966] Generating c_spm_scal.c [130/966] Generating z_spm_dof_extend.c [131/966] Generating d_spm_scal.c [132/966] Generating s_spm_scal.c [133/966] Generating z_spm_scal.c [134/966] Generating d_spm_2dense.c [135/966] Generating c_spm_2dense.c [136/966] Generating d_spm_rhs.c [137/966] Generating s_spm_2dense.c [138/966] Generating z_spm_2dense.c [139/966] Generating c_spm_rhs.c [140/966] Generating s_spm_rhs.c [141/966] Generating d_spm_convert_to_csr.c [142/966] Generating z_spm_rhs.c [143/966] Generating s_spm_convert_to_csr.c [144/966] Generating z_spm_convert_to_csr.c [145/966] Generating c_spm_convert_to_csr.c [146/966] Generating p_spm_convert_to_csr.c [147/966] Generating s_spm_convert_to_ijv.c [148/966] Generating d_spm_convert_to_ijv.c [149/966] Generating z_spm_convert_to_ijv.c [150/966] Generating p_spm_convert_to_ijv.c [151/966] Generating c_spm_convert_to_ijv.c [152/966] Generating s_spm_convert_to_csc.c [153/966] Generating z_spm_convert_to_csc.c [154/966] Generating d_spm_convert_to_csc.c [155/966] Generating d_spm_norm.c [156/966] Generating c_spm_norm.c [157/966] Generating 
c_spm_convert_to_csc.c [158/966] Generating p_spm_convert_to_csc.c [159/966] Generating z_spm_norm.c [160/966] Generating s_spm_norm.c [161/966] Generating c_spm_genrhs.c [162/966] Generating d_spm_genrhs.c [163/966] Generating s_spm_genrhs.c [164/966] Generating d_spm_expand.c [165/966] Generating c_spm_integer.c [166/966] Generating z_spm_genrhs.c [167/966] Generating d_spm_integer.c [168/966] Generating s_spm_expand.c [169/966] Generating c_spm_expand.c [170/966] Generating s_spm_integer.c [171/966] Generating z_spm_integer.c [172/966] Generating p_spm_expand.c [173/966] Generating z_spm_expand.c [174/966] Generating s_spm_mergeduplicate.c [175/966] Generating d_spm_mergeduplicate.c [176/966] Generating z_spm_mergeduplicate.c [177/966] Generating p_spm_mergeduplicate.c [178/966] Generating c_spm_mergeduplicate.c [179/966] Generating d_spm_genmat.c [180/966] Generating c_spm_genmat.c [181/966] Generating z_spm_genmat.c [182/966] Generating s_spm_genmat.c [183/966] Generating d_spm_laplacian.c [184/966] Generating z_spm_laplacian.c [185/966] Generating s_spm_laplacian.c [186/966] Generating c_spm_laplacian.c [187/966] Generating p_spm_laplacian.c [188/966] Generating d_spm_sort.c [189/966] Generating d_spm_print.c [190/966] Generating c_spm_sort.c [191/966] Generating s_spm_sort.c [192/966] Generating p_spm_sort.c [193/966] Generating z_spm_sort.c [194/966] Generating z_spm_print.c [195/966] Generating s_spm_print.c [196/966] Generating p_spm_print.c [197/966] Generating c_spm_print.c [198/966] Generating s_spm_matrixvector.c [199/966] Generating c_spm_matrixvector.c [200/966] Generating z_spm_matrixvector.c [201/966] Generating d_spm_matrixvector.c [202/966] Building C object spm/src/CMakeFiles/spm.dir/d_spm_scal.c.o [203/966] Building C object spm/src/CMakeFiles/spm.dir/s_spm_scal.c.o [204/966] Building C object spm/src/CMakeFiles/spm.dir/c_spm_scal.c.o [205/966] Building Fortran preprocessed 
spm/wrappers/fortran90/CMakeFiles/spmf.dir/src/spmf_bindings.f90-pp.f90 [206/966] Building C object spm/src/CMakeFiles/spm.dir/z_spm_scal.c.o [207/966] Building Fortran preprocessed spm/wrappers/fortran90/CMakeFiles/spmf.dir/src/spmf_enums.F90-pp.f90 [208/966] Building Fortran preprocessed spm/wrappers/fortran90/CMakeFiles/spmf.dir/src/spmf_functions.f90-pp.f90 [209/966] Building Fortran preprocessed spm/wrappers/fortran90/CMakeFiles/spmf.dir/src/spmf_interfaces.f90-pp.f90 [210/966] Building C object spm/src/CMakeFiles/spm.dir/s_spm_convert_to_csr.c.o [211/966] Building Fortran preprocessed spm/wrappers/fortran90/CMakeFiles/spmf.dir/src/spmf.f90-pp.f90 [212/966] Building C object spm/src/CMakeFiles/spm.dir/d_spm_convert_to_csr.c.o [213/966] Building C object spm/src/CMakeFiles/spm.dir/z_spm_convert_to_csr.c.o [214/966] Building C object spm/src/CMakeFiles/spm.dir/p_spm_convert_to_csr.c.o [215/966] Building C object spm/src/CMakeFiles/spm.dir/c_spm_convert_to_csr.c.o [216/966] Building Fortran preprocessed spm/wrappers/fortran90/CMakeFiles/spmf_driver.dir/examples/spmf_driver.F90-pp.f90 [217/966] Building Fortran preprocessed spm/wrappers/fortran90/CMakeFiles/spmf_user.dir/examples/spmf_user.F90-pp.f90 [218/966] Building C object spm/src/CMakeFiles/spm.dir/d_spm_dof_extend.c.o [219/966] Building C object spm/src/CMakeFiles/spm.dir/s_spm_dof_extend.c.o [220/966] Building Fortran preprocessed spm/wrappers/fortran90/CMakeFiles/spmf_rebalance.dir/examples/spmf_rebalance.F90-pp.f90 [221/966] Building C object spm/src/CMakeFiles/spm.dir/z_spm_dof_extend.c.o [222/966] Building C object spm/src/CMakeFiles/spm.dir/s_spm_2dense.c.o [223/966] Building C object spm/src/CMakeFiles/spm.dir/z_spm_2dense.c.o [224/966] Building C object spm/src/CMakeFiles/spm.dir/d_spm_2dense.c.o [225/966] Building C object spm/src/CMakeFiles/spm.dir/c_spm_2dense.c.o [226/966] Building C object spm/src/CMakeFiles/spm.dir/d_spm_rhs.c.o [227/966] Building C object 
spm/src/CMakeFiles/spm.dir/s_spm_convert_to_ijv.c.o [228/966] Building C object spm/src/CMakeFiles/spm.dir/d_spm_convert_to_ijv.c.o [229/966] Building C object spm/src/CMakeFiles/spm.dir/c_spm_dof_extend.c.o [230/966] Building C object spm/src/CMakeFiles/spm.dir/z_spm_convert_to_ijv.c.o [231/966] Generating Fortran dyndep file spm/wrappers/fortran90/CMakeFiles/spmf.dir/Fortran.dd [232/966] Building C object spm/src/CMakeFiles/spm.dir/c_spm_rhs.c.o [233/966] Building C object spm/src/CMakeFiles/spm.dir/s_spm_rhs.c.o [234/966] Building C object spm/src/CMakeFiles/spm.dir/z_spm_rhs.c.o [235/966] Building C object spm/src/CMakeFiles/spm.dir/c_spm_convert_to_ijv.c.o [236/966] Building C object spm/src/CMakeFiles/spm.dir/p_spm_convert_to_ijv.c.o [237/966] Building C object spm/src/CMakeFiles/spm.dir/p_spm_convert_to_csc.c.o [238/966] Building C object spm/src/CMakeFiles/spm.dir/s_spm_convert_to_csc.c.o [239/966] Building C object spm/src/CMakeFiles/spm.dir/d_spm_convert_to_csc.c.o [240/966] Generating Fortran dyndep file spm/wrappers/fortran90/CMakeFiles/spmf_driver.dir/Fortran.dd [241/966] Generating Fortran dyndep file spm/wrappers/fortran90/CMakeFiles/spmf_user.dir/Fortran.dd [242/966] Building C object spm/src/CMakeFiles/spm.dir/z_spm_convert_to_csc.c.o [243/966] Generating Fortran dyndep file spm/wrappers/fortran90/CMakeFiles/spmf_rebalance.dir/Fortran.dd [244/966] Building C object spm/src/CMakeFiles/spm.dir/c_spm_convert_to_csc.c.o [245/966] Building C object spm/src/CMakeFiles/spm.dir/d_spm_mergeduplicate.c.o [246/966] Building C object spm/src/CMakeFiles/spm.dir/p_spm_mergeduplicate.c.o [247/966] Building C object spm/src/CMakeFiles/spm.dir/s_spm_integer.c.o [248/966] Building C object spm/src/CMakeFiles/spm.dir/s_spm_mergeduplicate.c.o [249/966] Building C object spm/src/CMakeFiles/spm.dir/z_spm_expand.c.o [250/966] Building C object spm/src/CMakeFiles/spm.dir/c_spm_integer.c.o [251/966] Building C object spm/src/CMakeFiles/spm.dir/d_spm_expand.c.o [252/966] 
Building C object spm/src/CMakeFiles/spm.dir/z_spm_mergeduplicate.c.o [253/966] Building C object spm/src/CMakeFiles/spm.dir/c_spm_mergeduplicate.c.o [254/966] Building C object spm/src/CMakeFiles/spm.dir/z_spm_integer.c.o [255/966] Building C object spm/src/CMakeFiles/spm.dir/c_spm_expand.c.o [256/966] Building C object spm/src/CMakeFiles/spm.dir/d_spm_integer.c.o [257/966] Building C object spm/src/CMakeFiles/spm.dir/d_spm_laplacian.c.o [258/966] Building C object spm/src/CMakeFiles/spm.dir/s_spm_expand.c.o [259/966] Building C object spm/src/CMakeFiles/spm.dir/p_spm_expand.c.o /build/pastix/src/build/spm/src/p_spm_expand.c: In function ‘p_spmIJVExpand’: /build/pastix/src/build/spm/src/p_spm_expand.c:414:10: warning: variable ‘oldval’ set but not used [-Wunused-but-set-variable=] 414 | int *oldval = NULL; | ^~~~~~ [260/966] Building C object spm/src/CMakeFiles/spm.dir/p_spm_laplacian.c.o [261/966] Building C object spm/src/CMakeFiles/spm.dir/s_spm_laplacian.c.o [262/966] Building C object spm/src/CMakeFiles/spm.dir/spm_rhs.c.o [263/966] Building C object spm/src/CMakeFiles/spm.dir/z_spm_laplacian.c.o [264/966] Building C object spm/src/CMakeFiles/spm.dir/c_spm_laplacian.c.o [265/966] Building C object spm/src/CMakeFiles/spm.dir/spm_dof_extend.c.o [266/966] Building C object spm/src/CMakeFiles/spm.dir/d_spm_sort.c.o [267/966] Building C object spm/src/CMakeFiles/spm.dir/s_spm_sort.c.o [268/966] Building C object spm/src/CMakeFiles/spm.dir/s_spm_print.c.o [269/966] Building C object spm/src/CMakeFiles/spm.dir/d_spm_print.c.o [270/966] Building C object spm/src/CMakeFiles/spm.dir/s_spm_norm.c.o [271/966] Building C object spm/src/CMakeFiles/spm.dir/z_spm_sort.c.o [272/966] Building C object spm/src/CMakeFiles/spm.dir/c_spm_sort.c.o [273/966] Building C object spm/src/CMakeFiles/spm.dir/spm_degree.c.o [274/966] Building C object spm/src/CMakeFiles/spm.dir/d_spm_norm.c.o [275/966] Building C object spm/src/CMakeFiles/spm.dir/drivers/readhb.c.o [276/966] Building C 
object spm/src/CMakeFiles/spm.dir/p_spm_sort.c.o [277/966] Building C object spm/src/CMakeFiles/spm.dir/c_spm_print.c.o [278/966] Building C object spm/src/CMakeFiles/spm.dir/spm_read_driver.c.o [279/966] Building C object spm/src/CMakeFiles/spm.dir/p_spm_print.c.o /build/pastix/src/build/spm/src/p_spm_print.c: In function ‘p_spm_print_elt_sym_diag’: /build/pastix/src/build/spm/src/p_spm_print.c:60:38: warning: parameter ‘valptr’ set but not used [-Wunused-but-set-parameter=] 60 | const int *valptr, | ~~~~~~~~~~~^~~~~~ /build/pastix/src/build/spm/src/p_spm_print.c: In function ‘p_spm_print_elt_gen_col’: /build/pastix/src/build/spm/src/p_spm_print.c:122:37: warning: parameter ‘valptr’ set but not used [-Wunused-but-set-parameter=] 122 | const int *valptr, | ~~~~~~~~~~~^~~~~~ /build/pastix/src/build/spm/src/p_spm_print.c: In function ‘p_spm_print_elt_gen_row’: /build/pastix/src/build/spm/src/p_spm_print.c:172:37: warning: parameter ‘valptr’ set but not used [-Wunused-but-set-parameter=] 172 | const int *valptr, | ~~~~~~~~~~~^~~~~~ /build/pastix/src/build/spm/src/p_spm_print.c: In function ‘p_spmPrintRHS’: /build/pastix/src/build/spm/src/p_spm_print.c:618:16: warning: variable ‘xptr’ set but not used [-Wunused-but-set-variable=] 618 | const int *xptr = (const int *)x; | ^~~~ [280/966] Building C object spm/src/CMakeFiles/spm.dir/z_spm_print.c.o [281/966] Building C object spm/src/CMakeFiles/spm.dir/c_spm_norm.c.o [282/966] Building C object spm/src/CMakeFiles/spm.dir/s_spm_genrhs.c.o [283/966] Building C object spm/src/CMakeFiles/spm.dir/z_spm_genrhs.c.o [284/966] Building C object spm/src/CMakeFiles/spm.dir/d_spm_genrhs.c.o [285/966] Building C object spm/src/CMakeFiles/spm.dir/c_spm_genrhs.c.o [286/966] Building C object spm/examples/CMakeFiles/example_drivers.dir/example_drivers.c.o [287/966] Building C object spm/src/CMakeFiles/spm.dir/z_spm_norm.c.o [288/966] Building C object spm/src/CMakeFiles/spm.dir/spm_update_compute_fields.c.o [289/966] Building C object 
spm/src/CMakeFiles/spm.dir/drivers/laplacian.c.o [290/966] Building C object spm/wrappers/fortran90/CMakeFiles/spmf.dir/src/spm_f2c.c.o [291/966] Building C object spm/src/CMakeFiles/spm.dir/drivers/readijv.c.o [292/966] Building C object spm/src/CMakeFiles/spm.dir/drivers/readmm.c.o [293/966] Building C object spm/src/CMakeFiles/spm.dir/d_spm_genmat.c.o /build/pastix/src/build/spm/src/d_spm_genmat.c: In function ‘Rnd64_jump’: /build/pastix/src/build/spm/src/d_spm_genmat.c:53:7: warning: variable ‘i’ set but not used [-Wunused-but-set-variable=] 53 | int i; | ^ [294/966] Building C object spm/src/CMakeFiles/spm.dir/s_spm_genmat.c.o /build/pastix/src/build/spm/src/s_spm_genmat.c: In function ‘Rnd64_jump’: /build/pastix/src/build/spm/src/s_spm_genmat.c:53:7: warning: variable ‘i’ set but not used [-Wunused-but-set-variable=] 53 | int i; | ^ [295/966] Building C object spm/src/CMakeFiles/spm.dir/c_spm_genmat.c.o /build/pastix/src/build/spm/src/c_spm_genmat.c: In function ‘Rnd64_jump’: /build/pastix/src/build/spm/src/c_spm_genmat.c:53:7: warning: variable ‘i’ set but not used [-Wunused-but-set-variable=] 53 | int i; | ^ [296/966] Building C object spm/examples/CMakeFiles/example_lap2.dir/example_lap2.c.o [297/966] Building C object spm/examples/CMakeFiles/example_lap1.dir/example_lap1.c.o [298/966] Building C object spm/src/CMakeFiles/spm.dir/z_spm_genmat.c.o /build/pastix/src/build/spm/src/z_spm_genmat.c: In function ‘Rnd64_jump’: /build/pastix/src/build/spm/src/z_spm_genmat.c:53:7: warning: variable ‘i’ set but not used [-Wunused-but-set-variable=] 53 | int i; | ^ [299/966] Building C object spm/examples/CMakeFiles/example_mdof2.dir/example_mdof2.c.o [300/966] Building C object spm/examples/CMakeFiles/example_mdof1.dir/example_mdof1.c.o [301/966] Building C object spm/src/CMakeFiles/spm.dir/spm_gen_fake_values.c.o [302/966] Building C object spm/src/CMakeFiles/spm.dir/drivers/mmio.c.o [303/966] Building C object spm/src/CMakeFiles/spm.dir/spm_symmetrize.c.o [304/966] 
Building C object spm/src/CMakeFiles/spm.dir/spm_gather.c.o [305/966] Building C object spm/src/CMakeFiles/spm.dir/c_spm_matrixvector.c.o [306/966] Building C object spm/src/CMakeFiles/spm.dir/s_spm_matrixvector.c.o [307/966] Building C object spm/src/CMakeFiles/spm.dir/spm_integers.c.o [308/966] Building C object spm/src/CMakeFiles/spm.dir/d_spm_matrixvector.c.o [309/966] Building C object spm/src/CMakeFiles/spm.dir/z_spm_matrixvector.c.o [310/966] Building C object spm/src/CMakeFiles/spm.dir/spm_io.c.o /build/pastix/src/pastix-6.4.0/spm/src/spm_io.c: In function ‘spm_load_local’: /build/pastix/src/pastix-6.4.0/spm/src/spm_io.c:501:27: warning: comparison is always false due to limited range of data type [-Wtype-limits] 501 | if ( (line[i] == EOF ) || (line[i] == '\n') ) | ^~ [311/966] Building C object spm/src/CMakeFiles/spm.dir/spm_redistribute.c.o [312/966] Building C object spm/src/CMakeFiles/spm.dir/spm_scatter.c.o [313/966] Building Fortran object spm/wrappers/fortran90/CMakeFiles/spmf.dir/src/spmf_enums.F90.o [314/966] Building C object spm/src/CMakeFiles/spm.dir/spm.c.o [315/966] Building C object spm/src/CMakeFiles/spm.dir/drivers/iohb.c.o [316/966] Building Fortran object spm/wrappers/fortran90/CMakeFiles/spmf.dir/src/spmf_bindings.f90.o [317/966] Building Fortran object spm/wrappers/fortran90/CMakeFiles/spmf.dir/src/spmf_interfaces.f90.o [318/966] Building Fortran object spm/wrappers/fortran90/CMakeFiles/spmf.dir/src/spmf.f90.o [319/966] Building Fortran object spm/wrappers/fortran90/CMakeFiles/spmf_rebalance.dir/examples/spmf_rebalance.F90.o [320/966] Building Fortran object spm/wrappers/fortran90/CMakeFiles/spmf_driver.dir/examples/spmf_driver.F90.o [321/966] Building Fortran object spm/wrappers/fortran90/CMakeFiles/spmf_user.dir/examples/spmf_user.F90.o [322/966] Building Fortran object spm/wrappers/fortran90/CMakeFiles/spmf.dir/src/spmf_functions.f90.o [323/966] Linking C shared library spm/src/libspm.so.1.2.4 [324/966] Creating library symlink 
spm/src/libspm.so.1 spm/src/libspm.so [325/966] Generating core_dgemdm.c [326/966] Generating core_cgemdm.c [327/966] Generating core_sgemdm.c [328/966] Generating core_zgemdm.c [329/966] Generating core_dgetmo.c [330/966] Generating core_cgetmo.c [331/966] Generating core_sgetmo.c [332/966] Generating core_zgetmo.c [333/966] Generating core_dgeadd.c [334/966] Generating core_cgeadd.c [335/966] Generating core_sgeadd.c [336/966] Generating core_zgeadd.c [337/966] Generating core_dplrnt.c [338/966] Generating core_cplrnt.c [339/966] Generating core_splrnt.c [340/966] Generating core_zplrnt.c [341/966] Generating core_dtradd.c [342/966] Generating core_ctradd.c [343/966] Generating core_stradd.c [344/966] Generating core_ztradd.c [345/966] Generating core_dtrsmsp.c [346/966] Generating core_ctrsmsp.c [347/966] Generating core_strsmsp.c [348/966] Generating core_dscalo.c [349/966] Generating core_cscalo.c [350/966] Generating core_sscalo.c [351/966] Generating core_zscalo.c [352/966] Generating core_ztrsmsp.c [353/966] Generating core_dsytrfsp.c [354/966] Generating core_dlrnrm.c [355/966] Generating core_dpotrfsp.c [356/966] Generating core_ssytrfsp.c [357/966] Generating core_clrnrm.c [358/966] Generating core_zsytrfsp.c [359/966] Generating core_slrnrm.c [360/966] Generating core_cpotrfsp.c [361/966] Generating core_zlrnrm.c [362/966] Generating core_chetrfsp.c [363/966] Generating core_zpotrfsp.c [364/966] Generating core_spotrfsp.c [365/966] Generating core_cpxtrfsp.c [366/966] Generating core_zhetrfsp.c [367/966] Generating core_zpxtrfsp.c [368/966] Generating core_clrdbg.c [369/966] Generating core_dlrdbg.c [370/966] Generating core_dgetrfsp.c [371/966] Generating core_slrdbg.c [372/966] Generating core_cgetrfsp.c [373/966] Generating core_zlrdbg.c [374/966] Generating core_zgetrfsp.c [375/966] Generating core_sgetrfsp.c [376/966] Generating core_csytrfsp.c [377/966] Generating core_dlr2xx.c [378/966] Generating core_clr2xx.c [379/966] Generating core_slr2xx.c 
[380/966] Generating core_dxx2fr.c [381/966] Generating core_sxx2fr.c [382/966] Generating core_zlr2xx.c [383/966] Generating core_dgemmsp.c [384/966] Generating core_zgemmsp.c [385/966] Generating core_cxx2fr.c [386/966] Generating core_sgemmsp.c [387/966] Generating core_dlrmm.c [388/966] Generating core_cgemmsp.c [389/966] Generating core_clrmm.c [390/966] Generating core_slrmm.c [391/966] Generating core_zxx2fr.c [392/966] Generating core_cxx2lr.c [393/966] Generating core_dxx2lr.c [394/966] Generating core_zlrmm.c [395/966] Generating core_zxx2lr.c [396/966] Generating core_sxx2lr.c [397/966] Generating core_cpqrcp.c [398/966] Generating core_dpqrcp.c [399/966] Generating core_spqrcp.c [400/966] Generating core_drqrcp.c [401/966] Generating core_crqrcp.c [402/966] Generating core_srqrcp.c [403/966] Generating core_zrqrcp.c [404/966] Generating core_zpqrcp.c [405/966] Generating core_dtqrcp.c [406/966] Generating core_srqrrt.c [407/966] Generating core_drqrrt.c [408/966] Generating core_ctqrcp.c [409/966] Generating core_crqrrt.c [410/966] Generating core_stqrcp.c [411/966] Generating core_zrqrrt.c [412/966] Generating core_ztqrcp.c [413/966] Generating core_dlrothu.c [414/966] Generating core_clrothu.c [415/966] Generating core_slrothu.c [416/966] Generating core_zlrothu.c [417/966] Generating cpucblk_ddiff.c [418/966] Generating cpucblk_cdiff.c [419/966] Generating cpucblk_zdiff.c [420/966] Generating cpucblk_dcompress.c [421/966] Generating cpucblk_sdiff.c [422/966] Generating cpucblk_zcompress.c [423/966] Generating cpucblk_ccompress.c [424/966] Generating core_cgelrops_svd.c [425/966] Generating core_dgelrops_svd.c [426/966] Generating core_sgelrops_svd.c [427/966] Generating cpucblk_scompress.c [428/966] Generating core_zgelrops_svd.c [429/966] Generating cpucblk_dinit.c [430/966] Generating cpucblk_cschur.c [431/966] Generating cpucblk_dschur.c [432/966] Generating cpucblk_zschur.c [433/966] Generating cpucblk_sschur.c [434/966] Generating 
cpucblk_sinit.c [435/966] Generating cpucblk_cinit.c [436/966] Generating cpucblk_zinit.c [437/966] Generating cpucblk_dadd.c [438/966] Generating cpucblk_sadd.c [439/966] Generating cpucblk_cadd.c [440/966] Generating cpucblk_zadd.c [441/966] Generating cpublok_dadd.c [442/966] Generating cpucblk_zcinit.c [443/966] Generating cpublok_cadd.c [444/966] Generating cpublok_sadd.c [445/966] Generating cpublok_zadd.c [446/966] Generating cpucblk_dsinit.c [447/966] Generating cpucblk_cmpi_coeftab.c [448/966] Generating cpucblk_dmpi_coeftab.c [449/966] Generating cpucblk_cpack.c [450/966] Generating cpucblk_zmpi_coeftab.c [451/966] Generating cpucblk_smpi_coeftab.c [452/966] Generating cpucblk_dpack.c [453/966] Generating cpucblk_dmpi_rhs_bwd.c [454/966] Generating cpucblk_cmpi_rhs_bwd.c [455/966] Generating cpucblk_zpack.c [456/966] Generating cpucblk_spack.c [457/966] Generating cpucblk_smpi_rhs_bwd.c [458/966] Generating core_cgelrops.c [459/966] Generating cpucblk_zmpi_rhs_bwd.c [460/966] Generating cpucblk_cmpi_rhs_fwd.c [461/966] Generating core_zgelrops.c [462/966] Generating core_cgeadd.c [463/966] Generating core_dgeadd.c [464/966] Generating core_splrnt.c [465/966] Generating core_sgelrops.c [466/966] Generating cpucblk_dmpi_rhs_fwd.c [467/966] Generating core_cplrnt.c [468/966] Generating core_dplrnt.c [469/966] Generating cpucblk_smpi_rhs_fwd.c [470/966] Generating core_sgeadd.c [471/966] Generating cpucblk_zmpi_rhs_fwd.c [472/966] Generating core_zgeadd.c [473/966] Generating core_zplrnt.c [474/966] Generating solve_dtrsmsp.c [475/966] Generating solve_ctrsmsp.c [476/966] Generating solve_strsmsp.c [477/966] Generating d_spm_tests.c [478/966] Generating s_spm_tests.c [479/966] Generating z_spm_tests.c [480/966] Generating solve_ztrsmsp.c [481/966] Generating d_spm_sort_tests.c [482/966] Generating s_spm_sort_tests.c [483/966] Generating z_spm_sort_tests.c [484/966] Generating c_spm_tests.c [485/966] Generating c_spm_sort_tests.c [486/966] Generating 
core_dgelrops.c [487/966] Building C object spm/tests/CMakeFiles/spm_test.dir/d_spm_tests.c.o [488/966] Building C object spm/tests/CMakeFiles/spm_test.dir/c_spm_tests.c.o [489/966] Building C object spm/tests/CMakeFiles/spm_test.dir/s_spm_tests.c.o [490/966] Building C object spm/tests/CMakeFiles/spm_test.dir/z_spm_tests.c.o [491/966] Building C object spm/tests/CMakeFiles/spm_test.dir/core_dgeadd.c.o [492/966] Building C object spm/tests/CMakeFiles/spm_test.dir/core_cgeadd.c.o [493/966] Building C object spm/tests/CMakeFiles/spm_test.dir/core_sgeadd.c.o [494/966] Building C object spm/tests/CMakeFiles/spm_test.dir/core_zgeadd.c.o [495/966] Building C object spm/tests/CMakeFiles/spm_test.dir/d_spm_sort_tests.c.o [496/966] Building C object spm/tests/CMakeFiles/spm_test.dir/c_spm_sort_tests.c.o [497/966] Building C object spm/tests/CMakeFiles/spm_test.dir/s_spm_sort_tests.c.o [498/966] Building C object spm/tests/CMakeFiles/spm_test.dir/z_spm_sort_tests.c.o [499/966] Building C object spm/tests/CMakeFiles/spm_test.dir/p_spm_tests.c.o [500/966] Building C object spm/tests/CMakeFiles/spm_test.dir/spm_test_compare.c.o [501/966] Building C object spm/tests/CMakeFiles/spm_test.dir/spm_test_utils.c.o [502/966] Building C object spm/tests/CMakeFiles/spm_test.dir/get_options.c.o [503/966] Building C object spm/tests/CMakeFiles/spm_convert_tests.dir/spm_convert_tests.c.o [504/966] Building C object spm/tests/CMakeFiles/spm_norm_tests.dir/spm_norm_tests.c.o [505/966] Building C object spm/tests/CMakeFiles/spm_matvec_tests.dir/spm_matvec_tests.c.o [506/966] Building C object spm/tests/CMakeFiles/spm_expand_tests.dir/spm_expand_tests.c.o [507/966] Building C object spm/tests/CMakeFiles/spm_sort_tests.dir/spm_sort_tests.c.o [508/966] Building C object spm/tests/CMakeFiles/spm_check_and_correct_tests.dir/spm_check_and_correct_tests.c.o [509/966] Building C object spm/tests/CMakeFiles/spm_scatter_gather_tests.dir/spm_scatter_gather_tests.c.o [510/966] Building C object 
spm/tests/CMakeFiles/spm_dist_convert_tests.dir/spm_dist_convert_tests.c.o [511/966] Building C object spm/tests/CMakeFiles/spm_dist_norm_tests.dir/spm_dist_norm_tests.c.o [512/966] Building C object spm/tests/CMakeFiles/spm_dist_genrhs_tests.dir/spm_dist_genrhs_tests.c.o [513/966] Building C object spm/tests/CMakeFiles/spm_dist_matvec_tests.dir/spm_dist_matvec_tests.c.o [514/966] Building C object spm/tests/CMakeFiles/spm_dist_sort_tests.dir/spm_dist_sort_tests.c.o [515/966] Building C object spm/tests/CMakeFiles/spm_dist_check_and_correct_tests.dir/spm_dist_check_and_correct_tests.c.o [516/966] Building C object spm/tests/CMakeFiles/spm_redistribute_tests.dir/spm_redistribute_tests.c.o [517/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_dgemdm.c.o [518/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_cgemdm.c.o [519/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_sgemdm.c.o [520/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_zgemdm.c.o [521/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_dgetmo.c.o [522/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_cgetmo.c.o [523/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_sgetmo.c.o [524/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_zgetmo.c.o [525/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_dgeadd.c.o [526/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_cgeadd.c.o [527/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_sgeadd.c.o [528/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_zgeadd.c.o [529/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_dtradd.c.o [530/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_ctradd.c.o [531/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_stradd.c.o [532/966] Building C object 
kernels/CMakeFiles/pastix_kernels.dir/core_ztradd.c.o [533/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_dgemmsp.c.o [534/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_cgemmsp.c.o [535/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_sgemmsp.c.o [536/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_zgemmsp.c.o [537/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_dtrsmsp.c.o [538/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_ctrsmsp.c.o [539/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_strsmsp.c.o [540/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_ztrsmsp.c.o [541/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_dscalo.c.o [542/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_cscalo.c.o [543/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_sscalo.c.o [544/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_zscalo.c.o [545/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_dsytrfsp.c.o [546/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_csytrfsp.c.o [547/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_ssytrfsp.c.o [548/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_zsytrfsp.c.o [549/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_chetrfsp.c.o [550/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_zhetrfsp.c.o [551/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_dpotrfsp.c.o [552/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_cpotrfsp.c.o [553/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_spotrfsp.c.o [554/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_zpotrfsp.c.o [555/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_cpxtrfsp.c.o [556/966] Building C object 
kernels/CMakeFiles/pastix_kernels.dir/core_zpxtrfsp.c.o [557/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_dgetrfsp.c.o [558/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_cgetrfsp.c.o [559/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_sgetrfsp.c.o [560/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_zgetrfsp.c.o [561/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_dlrnrm.c.o [562/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_clrnrm.c.o [563/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_slrnrm.c.o [564/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_zlrnrm.c.o [565/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_dlrdbg.c.o [566/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_clrdbg.c.o [567/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_slrdbg.c.o [568/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_zlrdbg.c.o [569/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_dlr2xx.c.o [570/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_clr2xx.c.o [571/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_slr2xx.c.o [572/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_zlr2xx.c.o [573/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_dxx2lr.c.o [574/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_cxx2lr.c.o [575/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_sxx2lr.c.o [576/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_zxx2lr.c.o [577/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_dxx2fr.c.o [578/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_cxx2fr.c.o [579/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_sxx2fr.c.o [580/966] Building C object 
kernels/CMakeFiles/pastix_kernels.dir/core_zxx2fr.c.o [581/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_dlrmm.c.o [582/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_clrmm.c.o [583/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_slrmm.c.o [584/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_zlrmm.c.o [585/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_dpqrcp.c.o [586/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_cpqrcp.c.o [587/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_spqrcp.c.o [588/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_zpqrcp.c.o [589/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_drqrcp.c.o [590/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_crqrcp.c.o [591/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_srqrcp.c.o [592/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_zrqrcp.c.o [593/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_dtqrcp.c.o [594/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_ctqrcp.c.o [595/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_stqrcp.c.o [596/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_ztqrcp.c.o [597/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_drqrrt.c.o [598/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_crqrrt.c.o [599/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_srqrrt.c.o [600/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_zrqrrt.c.o [601/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_dlrothu.c.o [602/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_clrothu.c.o [603/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_slrothu.c.o [604/966] Building C object 
kernels/CMakeFiles/pastix_kernels.dir/core_zlrothu.c.o [605/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_dgelrops_svd.c.o [606/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_cgelrops_svd.c.o [607/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_sgelrops_svd.c.o [608/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_zgelrops_svd.c.o [609/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/cpucblk_dinit.c.o [610/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/cpucblk_cinit.c.o [611/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/cpucblk_sinit.c.o [612/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/cpucblk_zinit.c.o [613/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/cpucblk_dcompress.c.o [614/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/cpucblk_ccompress.c.o [615/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/cpucblk_scompress.c.o [616/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/cpucblk_zcompress.c.o [617/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/cpucblk_ddiff.c.o [618/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/cpucblk_cdiff.c.o [619/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/cpucblk_sdiff.c.o [620/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/cpucblk_zdiff.c.o [621/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/cpucblk_dadd.c.o [622/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/cpucblk_cadd.c.o [623/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/cpucblk_sadd.c.o [624/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/cpucblk_zadd.c.o [625/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/cpucblk_dschur.c.o [626/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/cpucblk_cschur.c.o [627/966] Building C object 
kernels/CMakeFiles/pastix_kernels.dir/cpucblk_sschur.c.o [628/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/cpucblk_zschur.c.o [629/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/cpublok_dadd.c.o [630/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/cpublok_cadd.c.o [631/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/cpublok_sadd.c.o [632/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/cpublok_zadd.c.o [633/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/cpucblk_dmpi_coeftab.c.o [634/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/cpucblk_dpack.c.o [635/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/cpucblk_cpack.c.o [636/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/cpucblk_spack.c.o [637/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/cpucblk_zpack.c.o [638/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/solve_dtrsmsp.c.o [639/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/solve_ctrsmsp.c.o [640/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/solve_strsmsp.c.o [641/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/solve_ztrsmsp.c.o [642/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/kernels.c.o [643/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/kernels_trace.c.o [644/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/lowrank.c.o [645/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/queue.c.o [646/966] Building C object CMakeFiles/pastix.dir/graph/graph.c.o [647/966] Building C object CMakeFiles/pastix.dir/graph/graph_compute_projection.c.o [648/966] Building C object CMakeFiles/pastix.dir/graph/graph_connected_components.c.o [649/966] Building C object CMakeFiles/pastix.dir/graph/graph_io.c.o [650/966] Building C object CMakeFiles/pastix.dir/graph/graph_isolate.c.o [651/966] Building C object 
CMakeFiles/pastix.dir/graph/graph_prepare.c.o [652/966] Building C object CMakeFiles/pastix.dir/order/order.c.o [653/966] Building C object CMakeFiles/pastix.dir/order/order_add_isolate.c.o [654/966] Building C object CMakeFiles/pastix.dir/order/order_amalgamate.c.o [655/966] Building C object CMakeFiles/pastix.dir/order/order_apply_level_order.c.o [656/966] Building C object CMakeFiles/pastix.dir/order/order_check.c.o [657/966] Building C object CMakeFiles/pastix.dir/order/order_compute_personal.c.o [658/966] Building C object CMakeFiles/pastix.dir/order/order_find_supernodes.c.o [659/966] Building C object CMakeFiles/pastix.dir/order/order_grids.c.o [660/966] Building C object CMakeFiles/pastix.dir/order/order_io.c.o [661/966] Building C object CMakeFiles/pastix.dir/order/pastix_subtask_order.c.o [662/966] Building C object CMakeFiles/pastix.dir/symbol/fax_csr.c.o [663/966] Building C object CMakeFiles/pastix.dir/symbol/fax_csr_direct.c.o [664/966] Building C object CMakeFiles/pastix.dir/symbol/fax_csr_iluk.c.o [665/966] Building C object CMakeFiles/pastix.dir/symbol/symbol_base.c.o [666/966] Building C object CMakeFiles/pastix.dir/symbol/symbol_check.c.o [667/966] Building C object CMakeFiles/pastix.dir/symbol/symbol_cost.c.o [668/966] Building C object CMakeFiles/pastix.dir/symbol/symbol_cost_flops.c.o [669/966] Building C object CMakeFiles/pastix.dir/symbol/symbol_cost_perfs.c.o [670/966] Building C object CMakeFiles/pastix.dir/symbol/symbol_draw.c.o [671/966] Building C object CMakeFiles/pastix.dir/symbol/symbol_draw_map.c.o [672/966] Building C object CMakeFiles/pastix.dir/symbol/symbol_expand.c.o [673/966] Building C object CMakeFiles/pastix.dir/symbol/symbol_io.c.o [674/966] Building C object CMakeFiles/pastix.dir/symbol/symbol_reorder.c.o [675/966] Building C object CMakeFiles/pastix.dir/symbol/pastix_subtask_reordering.c.o [676/966] Building C object CMakeFiles/pastix.dir/symbol/pastix_subtask_symbfact.c.o [677/966] Building C object 
CMakeFiles/pastix.dir/blend/pastix_task_analyze.c.o [678/966] Building C object CMakeFiles/pastix.dir/blend/blendctrl.c.o [679/966] Building C object CMakeFiles/pastix.dir/blend/cost.c.o [680/966] Building C object CMakeFiles/pastix.dir/blend/extendVector.c.o [681/966] Building C object CMakeFiles/pastix.dir/blend/simu_task.c.o [682/966] Building C object CMakeFiles/pastix.dir/common/getline.c.o
[683/966] Building C object spm/tests/CMakeFiles/spm_test.dir/core_dplrnt.c.o
/build/pastix/src/build/spm/tests/core_dplrnt.c: In function ‘Rnd64_jump’:
/build/pastix/src/build/spm/tests/core_dplrnt.c:30:7: warning: variable ‘i’ set but not used [-Wunused-but-set-variable=]
30 | int i;
   |     ^
[684/966] Building C object spm/tests/CMakeFiles/spm_test.dir/core_cplrnt.c.o
/build/pastix/src/build/spm/tests/core_cplrnt.c: In function ‘Rnd64_jump’:
/build/pastix/src/build/spm/tests/core_cplrnt.c:30:7: warning: variable ‘i’ set but not used [-Wunused-but-set-variable=]
30 | int i;
   |     ^
[685/966] Building C object spm/tests/CMakeFiles/spm_test.dir/core_splrnt.c.o
/build/pastix/src/build/spm/tests/core_splrnt.c: In function ‘Rnd64_jump’:
/build/pastix/src/build/spm/tests/core_splrnt.c:30:7: warning: variable ‘i’ set but not used [-Wunused-but-set-variable=]
30 | int i;
   |     ^
[686/966] Building C object spm/tests/CMakeFiles/spm_test.dir/core_zplrnt.c.o
/build/pastix/src/build/spm/tests/core_zplrnt.c: In function ‘Rnd64_jump’:
/build/pastix/src/build/spm/tests/core_zplrnt.c:30:7: warning: variable ‘i’ set but not used [-Wunused-but-set-variable=]
30 | int i;
   |     ^
[687/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_dplrnt.c.o
/build/pastix/src/build/kernels/core_dplrnt.c: In function ‘Rnd64_jump’:
/build/pastix/src/build/kernels/core_dplrnt.c:32:7: warning: variable ‘i’ set but not used [-Wunused-but-set-variable=]
32 | int i;
   |     ^
[688/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_cplrnt.c.o
/build/pastix/src/build/kernels/core_cplrnt.c: In function ‘Rnd64_jump’:
/build/pastix/src/build/kernels/core_cplrnt.c:32:7: warning: variable ‘i’ set but not used [-Wunused-but-set-variable=]
32 | int i;
   |     ^
[689/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_splrnt.c.o
/build/pastix/src/build/kernels/core_splrnt.c: In function ‘Rnd64_jump’:
/build/pastix/src/build/kernels/core_splrnt.c:32:7: warning: variable ‘i’ set but not used [-Wunused-but-set-variable=]
32 | int i;
   |     ^
[690/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_zplrnt.c.o
/build/pastix/src/build/kernels/core_zplrnt.c: In function ‘Rnd64_jump’:
/build/pastix/src/build/kernels/core_zplrnt.c:32:7: warning: variable ‘i’ set but not used [-Wunused-but-set-variable=]
32 | int i;
   |     ^
[691/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_dgelrops.c.o [692/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_cgelrops.c.o [693/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_sgelrops.c.o [694/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/core_zgelrops.c.o [695/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/cpucblk_cmpi_coeftab.c.o [696/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/cpucblk_smpi_coeftab.c.o [697/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/cpucblk_zmpi_coeftab.c.o [698/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/cpucblk_dmpi_rhs_fwd.c.o [699/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/cpucblk_cmpi_rhs_fwd.c.o [700/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/cpucblk_smpi_rhs_fwd.c.o [701/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/cpucblk_zmpi_rhs_fwd.c.o [702/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/cpucblk_dmpi_rhs_bwd.c.o [703/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/cpucblk_cmpi_rhs_bwd.c.o [704/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/cpucblk_smpi_rhs_bwd.c.o [705/966] Building C
object kernels/CMakeFiles/pastix_kernels.dir/cpucblk_zmpi_rhs_bwd.c.o [706/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/cpucblk_dsinit.c.o [707/966] Building C object CMakeFiles/pastix.dir/symbol/symbol.c.o [708/966] Building C object CMakeFiles/pastix.dir/symbol/symbol_fax_iluk.c.o [709/966] Building C object CMakeFiles/pastix.dir/blend/pastix_subtask_blend.c.o [710/966] Building C object CMakeFiles/pastix.dir/blend/cand_gendot.c.o [711/966] Building C object CMakeFiles/pastix.dir/blend/elimintree.c.o [712/966] Building C object CMakeFiles/pastix.dir/blend/simu.c.o [713/966] Building C object CMakeFiles/pastix.dir/blend/solver_backup.c.o [714/966] Building C object CMakeFiles/pastix.dir/blend/propmap.c.o [715/966] Building C object CMakeFiles/pastix.dir/blend/solver_copy.c.o [716/966] Building C object CMakeFiles/pastix.dir/blend/cand.c.o [717/966] Building C object CMakeFiles/pastix.dir/symbol/fax_csr_amalgamate.c.o [718/966] Building C object CMakeFiles/pastix.dir/symbol/symbol_fax_direct.c.o [719/966] Building C object CMakeFiles/pastix.dir/common/get_options.c.o [720/966] Building C object CMakeFiles/pastix.dir/blend/extracblk.c.o [721/966] Building C object CMakeFiles/pastix.dir/sopalin/diag.c.o [722/966] Building C object CMakeFiles/pastix.dir/symbol/symbol_reordering.c.o [723/966] Building C object CMakeFiles/pastix.dir/blend/solver_check.c.o [724/966] Building C object CMakeFiles/pastix.dir/blend/solver_draw.c.o [725/966] Building C object CMakeFiles/pastix.dir/blend/solver_recv.c.o [726/966] Building Fortran preprocessed wrappers/fortran90/CMakeFiles/pastixf.dir/src/pastixf_bindings.f90-pp.f90 [727/966] Building C object CMakeFiles/pastix.dir/common/isched.c.o [728/966] Building Fortran preprocessed wrappers/fortran90/CMakeFiles/pastixf.dir/src/pastixf_functions.f90-pp.f90 [729/966] Building C object CMakeFiles/pastix.dir/blend/solver_io.c.o [730/966] Building Fortran preprocessed 
wrappers/fortran90/CMakeFiles/pastixf.dir/src/pastixf_interfaces.f90-pp.f90 [731/966] Building Fortran preprocessed wrappers/fortran90/CMakeFiles/pastixf.dir/src/pastixf_enums.F90-pp.f90 [732/966] Building Fortran preprocessed wrappers/fortran90/CMakeFiles/pastixf.dir/src/pastixf.f90-pp.f90 [733/966] Building Fortran preprocessed wrappers/fortran90/CMakeFiles/fsimple.dir/examples/fsimple.F90-pp.f90 [734/966] Building C object CMakeFiles/pastix.dir/common/check_options.c.o [735/966] Building Fortran preprocessed wrappers/fortran90/CMakeFiles/fstep-by-step.dir/examples/fstep-by-step.F90-pp.f90 [736/966] Building Fortran preprocessed wrappers/fortran90/CMakeFiles/flaplacian.dir/examples/flaplacian.F90-pp.f90 [737/966] Building Fortran preprocessed wrappers/fortran90/CMakeFiles/fmultidof.dir/examples/fmultidof.F90-pp.f90 [738/966] Building Fortran preprocessed wrappers/fortran90/CMakeFiles/fusermat_csr.dir/examples/fusermat_csr.F90-pp.f90 [739/966] Building Fortran preprocessed wrappers/fortran90/CMakeFiles/fmultilap.dir/examples/fmultilap.F90-pp.f90 [740/966] Building C object kernels/CMakeFiles/pastix_kernels.dir/cpucblk_zcinit.c.o [741/966] Building C object CMakeFiles/pastix.dir/blend/splitsymbol.c.o [742/966] Building C object CMakeFiles/pastix.dir/common/parse_options.c.o [743/966] Building C object CMakeFiles/pastix.dir/common/d_integer.c.o [744/966] Building C object CMakeFiles/pastix.dir/common/s_integer.c.o [745/966] Building C object CMakeFiles/pastix.dir/common/models.c.o [746/966] Building C object CMakeFiles/pastix.dir/blend/solver.c.o [747/966] Building C object CMakeFiles/pastix.dir/common/c_integer.c.o [748/966] Building C object CMakeFiles/pastix.dir/common/z_integer.c.o [749/966] Building C object CMakeFiles/pastix.dir/sopalin/pastix.c.o [750/966] Building C object CMakeFiles/pastix.dir/refinement/pastix_task_refine.c.o [751/966] Building C object CMakeFiles/pastix.dir/blend/solver_matrix_gen.c.o [752/966] Building C object 
CMakeFiles/pastix.dir/bcsc/bcsc_dnorm.c.o [753/966] Generating Fortran dyndep file wrappers/fortran90/CMakeFiles/pastixf.dir/Fortran.dd [754/966] Building C object CMakeFiles/pastix.dir/bcsc/bcsc_cnorm.c.o [755/966] Building C object CMakeFiles/pastix.dir/bcsc/bcsc_snorm.c.o [756/966] Building C object CMakeFiles/pastix.dir/bcsc/bcsc_znorm.c.o [757/966] Building C object CMakeFiles/pastix.dir/blend/simu_run.c.o [758/966] Linking C executable spm/examples/example_drivers [759/966] Building C object CMakeFiles/pastix.dir/sopalin/schur.c.o [760/966] Building C object CMakeFiles/pastix.dir/common/api.c.o [761/966] Building C object CMakeFiles/pastix.dir/sopalin/coeftab.c.o [762/966] Building C object CMakeFiles/pastix.dir/bcsc/bcsc_sspmv.c.o [763/966] Building C object CMakeFiles/pastix.dir/sopalin/pastix_rhs.c.o [764/966] Building C object CMakeFiles/pastix.dir/bcsc/bcsc_dspmv.c.o [765/966] Building C object CMakeFiles/pastix.dir/bcsc/bcsc_zspmv.c.o [766/966] Building C object CMakeFiles/pastix.dir/bcsc/bvec.c.o [767/966] Generating Fortran dyndep file wrappers/fortran90/CMakeFiles/fsimple.dir/Fortran.dd [768/966] Generating Fortran dyndep file wrappers/fortran90/CMakeFiles/flaplacian.dir/Fortran.dd [769/966] Building C object CMakeFiles/pastix.dir/blend/solver_matrix_gen_utils.c.o [770/966] Generating Fortran dyndep file wrappers/fortran90/CMakeFiles/fmultidof.dir/Fortran.dd [771/966] Generating Fortran dyndep file wrappers/fortran90/CMakeFiles/fstep-by-step.dir/Fortran.dd [772/966] Generating Fortran dyndep file wrappers/fortran90/CMakeFiles/fusermat_csr.dir/Fortran.dd [773/966] Building C object CMakeFiles/pastix.dir/bcsc/bcsc_cspmv.c.o [774/966] Generating Fortran dyndep file wrappers/fortran90/CMakeFiles/fmultilap.dir/Fortran.dd [775/966] Building C object CMakeFiles/pastix.dir/common/integer.c.o [776/966] Building C object CMakeFiles/pastix.dir/sopalin/sequential_ddiag.c.o [777/966] Building C object CMakeFiles/pastix.dir/sopalin/sequential_dgetrf.c.o [778/966] 
Building C object CMakeFiles/pastix.dir/sopalin/sequential_zdiag.c.o [779/966] Building C object CMakeFiles/pastix.dir/sopalin/sequential_cdiag.c.o [780/966] Linking Fortran shared library spm/wrappers/fortran90/libspmf.so.1.2.4 [781/966] Building C object CMakeFiles/pastix.dir/sopalin/sequential_sdiag.c.o [782/966] Building C object CMakeFiles/pastix.dir/sopalin/pastix_task_solve.c.o [783/966] Building C object CMakeFiles/pastix.dir/sopalin/pastix_task_sopalin.c.o [784/966] Building C object CMakeFiles/pastix.dir/sopalin/coeftab_cinit.c.o [785/966] Building C object CMakeFiles/pastix.dir/sopalin/sequential_zgetrf.c.o [786/966] Building C object CMakeFiles/pastix.dir/refinement/c_refine_functions.c.o [787/966] Building C object CMakeFiles/pastix.dir/refinement/d_refine_functions.c.o [788/966] Building C object CMakeFiles/pastix.dir/refinement/z_refine_functions.c.o [789/966] Building C object CMakeFiles/pastix.dir/refinement/s_refine_functions.c.o [790/966] Building C object CMakeFiles/pastix.dir/sopalin/coeftab_zinit.c.o [791/966] Building C object CMakeFiles/pastix.dir/sopalin/coeftab_sinit.c.o [792/966] Building C object CMakeFiles/pastix.dir/refinement/c_refine_bicgstab.c.o [793/966] Building C object CMakeFiles/pastix.dir/refinement/d_refine_bicgstab.c.o [794/966] Building C object CMakeFiles/pastix.dir/sopalin/coeftab_dinit.c.o [795/966] Building C object CMakeFiles/pastix.dir/sopalin/sequential_sgetrf.c.o [796/966] Building C object CMakeFiles/pastix.dir/sopalin/sequential_cgetrf.c.o [797/966] Building C object CMakeFiles/pastix.dir/refinement/s_refine_bicgstab.c.o [798/966] Building C object CMakeFiles/pastix.dir/sopalin/sequential_cpotrf.c.o [799/966] Building C object CMakeFiles/pastix.dir/sopalin/sequential_spotrf.c.o [800/966] Building C object CMakeFiles/pastix.dir/refinement/z_refine_bicgstab.c.o [801/966] Building C object CMakeFiles/pastix.dir/bcsc/bcsc_sinit.c.o [802/966] Building C object CMakeFiles/pastix.dir/sopalin/sequential_cpxtrf.c.o 
[803/966] Building C object CMakeFiles/pastix.dir/bcsc/bcsc.c.o [804/966] Building C object CMakeFiles/pastix.dir/sopalin/sequential_dpotrf.c.o [805/966] Building C object CMakeFiles/pastix.dir/sopalin/sequential_zhetrf.c.o [806/966] Building C object CMakeFiles/pastix.dir/sopalin/sequential_chetrf.c.o [807/966] Building C object CMakeFiles/pastix.dir/sopalin/sequential_zpxtrf.c.o [808/966] Building C object CMakeFiles/pastix.dir/bcsc/bcsc_dinit.c.o [809/966] Building C object CMakeFiles/pastix.dir/sopalin/sequential_zpotrf.c.o [810/966] Building C object CMakeFiles/pastix.dir/bcsc/bvec_dmpi_comm.c.o [811/966] Building C object CMakeFiles/pastix.dir/sopalin/coeftab_zcinit.c.o [812/966] Creating library symlink spm/wrappers/fortran90/libspmf.so.1 spm/wrappers/fortran90/libspmf.so [813/966] Building C object CMakeFiles/pastix.dir/bcsc/bcsc_cinit.c.o [814/966] Building C object CMakeFiles/pastix.dir/sopalin/sequential_ssytrf.c.o [815/966] Building C object CMakeFiles/pastix.dir/sopalin/sequential_zsytrf.c.o [816/966] Building C object CMakeFiles/pastix.dir/sopalin/coeftab_dsinit.c.o [817/966] Building C object CMakeFiles/pastix.dir/refinement/d_refine_grad.c.o [818/966] Building C object CMakeFiles/pastix.dir/sopalin/sequential_dsytrf.c.o [819/966] Building C object CMakeFiles/pastix.dir/bcsc/bvec_smpi_comm.c.o [820/966] Building C object CMakeFiles/pastix.dir/sopalin/sequential_csytrf.c.o [821/966] Building C object CMakeFiles/pastix.dir/refinement/s_refine_grad.c.o [822/966] Building C object example/CMakeFiles/analyze.dir/analyze.c.o [823/966] Building C object CMakeFiles/pastix.dir/bcsc/bvec_zmpi_comm.c.o [824/966] Building C object CMakeFiles/pastix.dir/bcsc/bvec_cmpi_comm.c.o [825/966] Linking C executable spm/examples/example_lap2 [826/966] Building C object CMakeFiles/pastix.dir/refinement/c_refine_grad.c.o [827/966] Building C object CMakeFiles/pastix.dir/bcsc/bcsc_zinit.c.o [828/966] Building C object CMakeFiles/pastix.dir/refinement/c_refine_pivot.c.o 
[829/966] Building C object CMakeFiles/pastix.dir/refinement/s_refine_pivot.c.o [830/966] Building C object CMakeFiles/pastix.dir/refinement/d_refine_pivot.c.o [831/966] Building C object CMakeFiles/pastix.dir/refinement/z_refine_grad.c.o [832/966] Building C object CMakeFiles/pastix.dir/refinement/d_refine_gmres.c.o [833/966] Building C object CMakeFiles/pastix.dir/order/order_scotch_common.c.o [834/966] Linking C executable spm/examples/example_lap1 [835/966] Building C object CMakeFiles/pastix.dir/bcsc/bvec_dcompute.c.o [836/966] Building C object CMakeFiles/pastix.dir/bcsc/bvec_scompute.c.o [837/966] Building C object CMakeFiles/pastix.dir/bcsc/bvec_clapmr.c.o [838/966] Building C object CMakeFiles/pastix.dir/sopalin/sequential_dtrsm.c.o [839/966] Building C object CMakeFiles/pastix.dir/refinement/s_refine_gmres.c.o [840/966] Building C object example/CMakeFiles/ordering_grid.dir/ordering_grid.c.o [841/966] Building C object CMakeFiles/pastix.dir/refinement/z_refine_pivot.c.o [842/966] Building C object CMakeFiles/pastix.dir/graph/graph_compute_kway.c.o [843/966] Building C object CMakeFiles/pastix.dir/refinement/c_refine_gmres.c.o [844/966] Building C object CMakeFiles/pastix.dir/bcsc/bvec_zlapmr.c.o [845/966] Building C object CMakeFiles/pastix.dir/bcsc/bvec_dlapmr.c.o [846/966] Building C object CMakeFiles/pastix.dir/bcsc/bvec_slapmr.c.o
[847/966] Building C object wrappers/fortran90/CMakeFiles/pastixf.dir/src/pastix_f2c.c.o
In file included from /build/pastix/src/pastix-6.4.0/spm/include/spm.h:25,
                 from /build/pastix/src/pastix-6.4.0/include/pastix.h:27,
                 from /build/pastix/src/pastix-6.4.0/common/common.h:24,
                 from /build/pastix/src/pastix-6.4.0/wrappers/fortran90/src/pastix_f2c.c:19:
/build/pastix/src/build/spm/include/spm/config.h:25:9: warning: ‘SPM_WITH_MPI’ redefined
25 | #define SPM_WITH_MPI
   |         ^~~~~~~~~~~~
: note: this is the location of the previous definition
[848/966] Building C object CMakeFiles/pastix.dir/refinement/z_refine_gmres.c.o [849/966]
Building C object CMakeFiles/pastix.dir/order/order_compute_scotch.c.o [850/966] Building C object CMakeFiles/pastix.dir/sopalin/sequential_ztrsm.c.o [851/966] Building C object example/CMakeFiles/compress.dir/compress.c.o [852/966] Building C object CMakeFiles/pastix.dir/sopalin/sequential_ctrsm.c.o [853/966] Building C object example/CMakeFiles/bench_facto.dir/bench_facto.c.o [854/966] Building C object example/CMakeFiles/pastix_benchmark.dir/pastix_benchmark.c.o [855/966] Building C object example/CMakeFiles/multidof.dir/multidof.c.o [856/966] Building C object CMakeFiles/pastix.dir/bcsc/bvec_ccompute.c.o [857/966] Building C object example/CMakeFiles/refinement.dir/refinement.c.o [858/966] Building C object example/CMakeFiles/simple_solve_and_refine.dir/simple_solve_and_refine.c.o [859/966] Building C object CMakeFiles/pastix.dir/bcsc/bvec_zcompute.c.o [860/966] Building C object example/CMakeFiles/simple.dir/simple.c.o [861/966] Building C object example/old/CMakeFiles/old_simple.dir/simple.c.o [862/966] Building C object CMakeFiles/pastix.dir/common/isched_hwloc.c.o [863/966] Building C object example/CMakeFiles/personal.dir/personal.c.o [864/966] Building C object example/CMakeFiles/simple_trans.dir/simple_trans.c.o [865/966] Building C object example/CMakeFiles/isolate.dir/isolate.c.o [866/966] Building C object CMakeFiles/pastix.dir/sopalin/sequential_strsm.c.o [867/966] Building C object example/CMakeFiles/step-by-step.dir/step-by-step.c.o [868/966] Linking C executable spm/examples/example_mdof2 [869/966] Building C object CMakeFiles/pastix.dir/sopalin/coeftab_s.c.o [870/966] Building C object example/CMakeFiles/reentrant.dir/reentrant.c.o [871/966] Building C object example/old/CMakeFiles/old_step-by-step.dir/step-by-step.c.o [872/966] Building C object CMakeFiles/pastix.dir/order/order_draw.c.o [873/966] Building C object CMakeFiles/pastix.dir/sopalin/coeftab_d.c.o [874/966] Building C object CMakeFiles/pastix.dir/sopalin/coeftab_c.c.o [875/966] 
Linking C executable spm/examples/example_mdof1 [876/966] Building C object CMakeFiles/pastix.dir/order/order_supernodes.c.o [877/966] Building C object CMakeFiles/pastix.dir/sopalin/coeftab_z.c.o [878/966] Linking Fortran executable spm/wrappers/fortran90/spmf_user [879/966] Linking Fortran executable spm/wrappers/fortran90/spmf_driver [880/966] Building C object example/CMakeFiles/schur.dir/schur.c.o [881/966] Linking Fortran executable spm/wrappers/fortran90/spmf_rebalance [882/966] Building C object example/CMakeFiles/dump_rank.dir/dump_rank.c.o [883/966] Building Fortran object wrappers/fortran90/CMakeFiles/pastixf.dir/src/pastixf_enums.F90.o [884/966] Building Fortran object wrappers/fortran90/CMakeFiles/pastixf.dir/src/pastixf_bindings.f90.o [885/966] Building Fortran object wrappers/fortran90/CMakeFiles/pastixf.dir/src/pastixf_interfaces.f90.o [886/966] Building Fortran object wrappers/fortran90/CMakeFiles/pastixf.dir/src/pastixf.f90.o [887/966] Building Fortran object wrappers/fortran90/CMakeFiles/fsimple.dir/examples/fsimple.F90.o [888/966] Building Fortran object wrappers/fortran90/CMakeFiles/fmultidof.dir/examples/fmultidof.F90.o [889/966] Building Fortran object wrappers/fortran90/CMakeFiles/fstep-by-step.dir/examples/fstep-by-step.F90.o [890/966] Building Fortran object wrappers/fortran90/CMakeFiles/flaplacian.dir/examples/flaplacian.F90.o [891/966] Building Fortran object wrappers/fortran90/CMakeFiles/fusermat_csr.dir/examples/fusermat_csr.F90.o [892/966] Building Fortran object wrappers/fortran90/CMakeFiles/pastixf.dir/src/pastixf_functions.f90.o [893/966] Building Fortran object wrappers/fortran90/CMakeFiles/fmultilap.dir/examples/fmultilap.F90.o [894/966] Linking C shared library spm/tests/libspm_test.so [895/966] Linking C executable spm/tests/spm_dist_norm_tests [896/966] Linking C executable spm/tests/spm_dist_sort_tests [897/966] Linking C executable spm/tests/spm_norm_tests [898/966] Linking C executable spm/tests/spm_expand_tests [899/966] 
Linking C executable spm/tests/spm_matvec_tests [900/966] Linking C executable spm/tests/spm_convert_tests [901/966] Linking C executable spm/tests/spm_dist_matvec_tests [902/966] Linking C executable spm/tests/spm_dist_convert_tests [903/966] Linking C executable spm/tests/spm_check_and_correct_tests [904/966] Linking C executable spm/tests/spm_redistribute_tests [905/966] Linking C executable spm/tests/spm_sort_tests [906/966] Linking C executable spm/tests/spm_dist_check_and_correct_tests [907/966] Linking C executable spm/tests/spm_dist_genrhs_tests [908/966] Linking C executable spm/tests/spm_scatter_gather_tests [909/966] Linking C shared library kernels/libpastix_kernels.so [910/966] Linking C shared library libpastix.so.6.4.0 [911/966] Creating library symlink libpastix.so.6.4 libpastix.so [912/966] Generating c_bcsc_tests.c [913/966] Generating z_bcsc_tests.c [914/966] Generating s_bcsc_tests.c [915/966] Generating d_bcsc_tests.c [916/966] Generating c_bvec_tests.c [917/966] Generating s_bvec_tests.c [918/966] Generating z_bvec_tests.c [919/966] Generating d_bvec_tests.c [920/966] Building C object test/CMakeFiles/bcsc_spmv_time.dir/bcsc_spmv_time.c.o [921/966] Building C object test/CMakeFiles/bcsc_norm_tests.dir/bcsc_norm_tests.c.o [922/966] Building C object test/CMakeFiles/bcsc_spmv_tests.dir/bcsc_spmv_tests.c.o [923/966] Building C object test/CMakeFiles/bvec_tests.dir/bvec_tests.c.o [924/966] Building C object test/CMakeFiles/bvec_applyorder_tests.dir/bvec_applyorder_tests.c.o [925/966] Building C object test/CMakeFiles/bcsc_test.dir/d_bcsc_tests.c.o [926/966] Building C object test/CMakeFiles/bcsc_test.dir/c_bcsc_tests.c.o [927/966] Building C object test/CMakeFiles/bcsc_test.dir/s_bcsc_tests.c.o [928/966] Building C object test/CMakeFiles/bcsc_test.dir/z_bcsc_tests.c.o [929/966] Building C object test/CMakeFiles/bvec_gemv_tests.dir/bvec_gemv_tests.c.o [930/966] Building C object test/CMakeFiles/bcsc_test.dir/z_bvec_tests.c.o [931/966] Building C 
object test/CMakeFiles/bcsc_test.dir/d_bvec_tests.c.o [932/966] Building C object test/CMakeFiles/bcsc_test.dir/s_bvec_tests.c.o [933/966] Building C object test/CMakeFiles/bcsc_test.dir/c_bvec_tests.c.o [934/966] Linking C executable example/analyze [935/966] Linking C executable example/simple [936/966] Linking C executable example/refinement [937/966] Linking C executable example/bench_facto [938/966] Linking C executable example/simple_solve_and_refine [939/966] Linking C executable example/old/old_simple [940/966] Linking C executable example/ordering_grid [941/966] Linking C executable example/compress [942/966] Linking C executable example/personal [943/966] Linking C executable example/pastix_benchmark [944/966] Linking C executable example/simple_trans [945/966] Linking C executable example/multidof [946/966] Linking C executable example/step-by-step [947/966] Linking C executable example/isolate [948/966] Linking Fortran shared library wrappers/fortran90/libpastixf.so.6.4.0 [949/966] Linking C executable example/old/old_step-by-step [950/966] Linking C executable example/dump_rank [951/966] Linking C executable example/reentrant [952/966] Creating library symlink wrappers/fortran90/libpastixf.so.6 wrappers/fortran90/libpastixf.so [953/966] Linking C executable example/schur [954/966] Linking Fortran executable wrappers/fortran90/fsimple [955/966] Linking Fortran executable wrappers/fortran90/flaplacian [956/966] Linking Fortran executable wrappers/fortran90/fstep-by-step [957/966] Linking Fortran executable wrappers/fortran90/fmultidof [958/966] Linking Fortran executable wrappers/fortran90/fusermat_csr [959/966] Linking Fortran executable wrappers/fortran90/fmultilap [960/966] Linking C shared library test/libbcsc_test.so [961/966] Linking C executable test/bcsc_spmv_time [962/966] Linking C executable test/bcsc_norm_tests [963/966] Linking C executable test/bvec_gemv_tests [964/966] Linking C executable test/bvec_tests [965/966] Linking C executable 
test/bcsc_spmv_tests [966/966] Linking C executable test/bvec_applyorder_tests ==> Starting check()... Test project /build/pastix/src/build Start 1: c_shm_example_analyze_lap_s_facto0 Start 2: c_shm_example_analyze_lap_s_facto1 Start 3: c_shm_example_analyze_lap_s_facto2 Start 4: c_shm_example_analyze_lap_d_facto0 Start 5: c_shm_example_analyze_lap_d_facto1 Start 6: c_shm_example_analyze_lap_d_facto2 Start 7: c_shm_example_analyze_lap_c_facto0 Start 8: c_shm_example_analyze_lap_c_facto1 Start 9: c_shm_example_analyze_lap_c_facto2 Start 10: c_shm_example_analyze_lap_c_facto3 Start 11: c_shm_example_analyze_lap_c_facto4 Start 12: c_shm_example_analyze_lap_z_facto0 Start 13: c_shm_example_analyze_lap_z_facto1 Start 14: c_shm_example_analyze_lap_z_facto2 Start 15: c_shm_example_analyze_lap_z_facto3 Start 16: c_shm_example_analyze_lap_z_facto4 Start 17: c_shm_example_simple_lap_s_facto0 Start 18: c_shm_example_simple_lap_s_facto1 Start 19: c_shm_example_simple_lap_s_facto2 Start 20: c_shm_example_simple_lap_d_facto0 Start 21: c_shm_example_simple_lap_d_facto1 Start 22: c_shm_example_simple_lap_d_facto2 Start 23: c_shm_example_simple_lap_c_facto0 Start 24: c_shm_example_simple_lap_c_facto1 Start 25: c_shm_example_simple_lap_c_facto2 Start 26: c_shm_example_simple_lap_c_facto3 Start 27: c_shm_example_simple_lap_c_facto4 Start 28: c_shm_example_simple_lap_z_facto0 Start 29: c_shm_example_simple_lap_z_facto1 Start 30: c_shm_example_simple_lap_z_facto2 Start 31: c_shm_example_simple_lap_z_facto3 Start 32: c_shm_example_simple_lap_z_facto4 Start 33: c_shm_example_simple_solve_and_refine_lap_s_facto0 Start 34: c_shm_example_simple_solve_and_refine_lap_s_facto1 Start 35: c_shm_example_simple_solve_and_refine_lap_s_facto2 Start 36: c_shm_example_simple_solve_and_refine_lap_d_facto0 Start 37: c_shm_example_simple_solve_and_refine_lap_d_facto1 Start 38: c_shm_example_simple_solve_and_refine_lap_d_facto2 Start 39: c_shm_example_simple_solve_and_refine_lap_c_facto0 Start 40: 
c_shm_example_simple_solve_and_refine_lap_c_facto1 Start 41: c_shm_example_simple_solve_and_refine_lap_c_facto2 Start 42: c_shm_example_simple_solve_and_refine_lap_c_facto3 Start 43: c_shm_example_simple_solve_and_refine_lap_c_facto4 Start 44: c_shm_example_simple_solve_and_refine_lap_z_facto0 Start 45: c_shm_example_simple_solve_and_refine_lap_z_facto1 Start 46: c_shm_example_simple_solve_and_refine_lap_z_facto2 Start 47: c_shm_example_simple_solve_and_refine_lap_z_facto3 Start 48: c_shm_example_simple_solve_and_refine_lap_z_facto4 Start 49: c_shm_example_simple_trans_lap_s_facto0 Start 50: c_shm_example_simple_trans_lap_s_facto1 Start 51: c_shm_example_simple_trans_lap_s_facto2 Start 52: c_shm_example_simple_trans_lap_d_facto0 Start 53: c_shm_example_simple_trans_lap_d_facto1 Start 54: c_shm_example_simple_trans_lap_d_facto2 Start 55: c_shm_example_simple_trans_lap_c_facto0 Start 56: c_shm_example_simple_trans_lap_c_facto1 Start 57: c_shm_example_simple_trans_lap_c_facto2 Start 58: c_shm_example_simple_trans_lap_c_facto3 Start 59: c_shm_example_simple_trans_lap_c_facto4 Start 60: c_shm_example_simple_trans_lap_z_facto0 Start 61: c_shm_example_simple_trans_lap_z_facto1 Start 62: c_shm_example_simple_trans_lap_z_facto2 Start 63: c_shm_example_simple_trans_lap_z_facto3 Start 64: c_shm_example_simple_trans_lap_z_facto4 Start 65: c_shm_example_step-by-step_lap_s_facto0 Start 66: c_shm_example_step-by-step_lap_s_facto1 Start 67: c_shm_example_step-by-step_lap_s_facto2 Start 68: c_shm_example_step-by-step_lap_d_facto0 Start 69: c_shm_example_step-by-step_lap_d_facto1 Start 70: c_shm_example_step-by-step_lap_d_facto2 Start 71: c_shm_example_step-by-step_lap_c_facto0 Start 72: c_shm_example_step-by-step_lap_c_facto1 Start 73: c_shm_example_step-by-step_lap_c_facto2 Start 74: c_shm_example_step-by-step_lap_c_facto3 Start 75: c_shm_example_step-by-step_lap_c_facto4 Start 76: c_shm_example_step-by-step_lap_z_facto0 Start 77: c_shm_example_step-by-step_lap_z_facto1 Start 78: 
c_shm_example_step-by-step_lap_z_facto2 Start 79: c_shm_example_step-by-step_lap_z_facto3 Start 80: c_shm_example_step-by-step_lap_z_facto4 Start 81: c_shm_example_schur_lap_s_facto0 Start 82: c_shm_example_schur_lap_s_facto2 Start 83: c_shm_example_schur_lap_d_facto0 Start 84: c_shm_example_schur_lap_d_facto2 Start 85: c_shm_example_schur_lap_c_facto0 Start 86: c_shm_example_schur_lap_c_facto2 Start 87: c_shm_example_schur_lap_z_facto0 Start 88: c_shm_example_schur_lap_z_facto2 Start 89: c_shm_example_personal_lap_s_facto0 Start 90: c_shm_example_personal_lap_s_facto1 Start 91: c_shm_example_personal_lap_s_facto2 Start 92: c_shm_example_personal_lap_d_facto0 Start 93: c_shm_example_personal_lap_d_facto1 Start 94: c_shm_example_personal_lap_d_facto2 Start 95: c_shm_example_personal_lap_c_facto0 Start 96: c_shm_example_personal_lap_c_facto1 Start 97: c_shm_example_personal_lap_c_facto2 Start 98: c_shm_example_personal_lap_c_facto3 Start 99: c_shm_example_personal_lap_c_facto4 Start 100: c_shm_example_personal_lap_z_facto0 Start 101: c_shm_example_personal_lap_z_facto1 Start 102: c_shm_example_personal_lap_z_facto2 Start 103: c_shm_example_personal_lap_z_facto3 Start 104: c_shm_example_personal_lap_z_facto4 Start 105: c_shm_example_reentrant_lap_s_facto0 Start 106: c_shm_example_reentrant_lap_s_facto1 Start 107: c_shm_example_reentrant_lap_s_facto2 Start 108: c_shm_example_reentrant_lap_d_facto0 Start 109: c_shm_example_reentrant_lap_d_facto1 Start 110: c_shm_example_reentrant_lap_d_facto2 Start 111: c_shm_example_reentrant_lap_c_facto0 Start 112: c_shm_example_reentrant_lap_c_facto1 Start 113: c_shm_example_reentrant_lap_c_facto2 Start 114: c_shm_example_reentrant_lap_c_facto3 Start 115: c_shm_example_reentrant_lap_c_facto4 Start 116: c_shm_example_reentrant_lap_z_facto0 Start 117: c_shm_example_reentrant_lap_z_facto1 Start 118: c_shm_example_reentrant_lap_z_facto2 Start 119: c_shm_example_reentrant_lap_z_facto3 Start 120: c_shm_example_reentrant_lap_z_facto4 Start 
121: c_shm_example_simple_scotch_rsa Start 122: c_shm_example_simple_scotch_mm Start 123: c_shm_example_simple_scotch_hb Start 124: c_shm_example_simple_scotch_mm2 Start 125: c_shm_example_simple_single_rsa Start 126: c_shm_example_simple_single_mm Start 127: c_shm_example_simple_single_hb Start 128: c_shm_example_simple_single_mm2 Start 129: c_shm_example_step-by-step_single_rsa Start 130: c_shm_example_step-by-step_single_mm Start 131: c_shm_example_step-by-step_single_hb Start 132: c_shm_example_step-by-step_single_mm2 Start 133: c_shm_example_simple_refine_cg Start 134: c_shm_example_simple_refine_gmres Start 135: c_shm_example_simple_refine_bicgstab Start 136: c_shm_example_refinement_lap_s_refine_cg_sym Start 137: c_shm_example_refinement_lap_s_refine_gmres_sym Start 138: c_shm_example_refinement_lap_s_refine_bicgstab_sym Start 139: c_shm_example_refinement_lap_d_refine_cg_sym Start 140: c_shm_example_refinement_lap_d_refine_gmres_sym Start 141: c_shm_example_refinement_lap_d_refine_bicgstab_sym Start 142: c_shm_example_refinement_lap_c_refine_cg_her Start 143: c_shm_example_refinement_lap_c_refine_gmres_her Start 144: c_shm_example_refinement_lap_c_refine_bicgstab_her Start 145: c_shm_example_refinement_lap_c_refine_cg_sym Start 146: c_shm_example_refinement_lap_c_refine_gmres_sym Start 147: c_shm_example_refinement_lap_c_refine_bicgstab_sym Start 148: c_shm_example_refinement_lap_z_refine_cg_her Start 149: c_shm_example_refinement_lap_z_refine_gmres_her Start 150: c_shm_example_refinement_lap_z_refine_bicgstab_her Start 151: c_shm_example_refinement_lap_z_refine_cg_sym Start 152: c_shm_example_refinement_lap_z_refine_gmres_sym Start 153: c_shm_example_refinement_lap_z_refine_bicgstab_sym Start 154: c_shm_example_simple_mixed_refine_cg Start 155: c_shm_example_simple_mixed_refine_gmres Start 156: c_shm_example_simple_mixed_refine_bicgstab Start 157: c_shm_example_simple_mixed_lap_d_facto0 Start 158: c_shm_example_simple_mixed_lap_d_facto1 Start 159: 
c_shm_example_simple_mixed_lap_d_facto2 Start 160: c_shm_example_simple_mixed_lap_d_refine_cg_sym Start 161: c_shm_example_simple_mixed_lap_d_refine_gmres_sym Start 162: c_shm_example_simple_mixed_lap_d_refine_bicgstab_sym Start 163: c_shm_example_simple_mixed_lap_z_facto0 Start 164: c_shm_example_simple_mixed_lap_z_facto1 Start 165: c_shm_example_simple_mixed_lap_z_facto2 Start 166: c_shm_example_simple_mixed_lap_z_facto3 Start 167: c_shm_example_simple_mixed_lap_z_facto4 Start 168: c_shm_example_simple_mixed_lap_z_refine_cg_her Start 169: c_shm_example_simple_mixed_lap_z_refine_gmres_her Start 170: c_shm_example_simple_mixed_lap_z_refine_bicgstab_her Start 171: c_shm_example_simple_mixed_lap_z_refine_cg_sym Start 172: c_shm_example_simple_mixed_lap_z_refine_gmres_sym Start 173: c_shm_example_simple_mixed_lap_z_refine_bicgstab_sym Start 174: shm_example_simple_lap_s_facto0_sched0_1d Start 175: shm_example_simple_lap_s_facto1_sched0_1d Start 176: shm_example_simple_lap_s_facto2_sched0_1d Start 177: shm_example_simple_lap_d_facto0_sched0_1d Start 178: shm_example_simple_lap_d_facto1_sched0_1d Start 179: shm_example_simple_lap_d_facto2_sched0_1d Start 180: shm_example_simple_lap_c_facto0_sched0_1d Start 181: shm_example_simple_lap_c_facto1_sched0_1d Start 182: shm_example_simple_lap_c_facto2_sched0_1d Start 183: shm_example_simple_lap_c_facto3_sched0_1d Start 184: shm_example_simple_lap_c_facto4_sched0_1d Start 185: shm_example_simple_lap_z_facto0_sched0_1d Start 186: shm_example_simple_lap_z_facto1_sched0_1d Start 187: shm_example_simple_lap_z_facto2_sched0_1d Start 188: shm_example_simple_lap_z_facto3_sched0_1d Start 189: shm_example_simple_lap_z_facto4_sched0_1d Start 190: shm_example_simple_lap_s_facto0_sched1_1d Start 191: shm_example_simple_lap_s_facto1_sched1_1d Start 192: shm_example_simple_lap_s_facto2_sched1_1d Start 193: shm_example_simple_lap_d_facto0_sched1_1d Start 194: shm_example_simple_lap_d_facto1_sched1_1d Start 195: 
shm_example_simple_lap_d_facto2_sched1_1d Start 196: shm_example_simple_lap_c_facto0_sched1_1d Start 197: shm_example_simple_lap_c_facto1_sched1_1d Start 198: shm_example_simple_lap_c_facto2_sched1_1d Start 199: shm_example_simple_lap_c_facto3_sched1_1d Start 200: shm_example_simple_lap_c_facto4_sched1_1d Start 201: shm_example_simple_lap_z_facto0_sched1_1d Start 202: shm_example_simple_lap_z_facto1_sched1_1d Start 203: shm_example_simple_lap_z_facto2_sched1_1d Start 204: shm_example_simple_lap_z_facto3_sched1_1d Start 205: shm_example_simple_lap_z_facto4_sched1_1d Start 206: shm_example_simple_lap_s_facto0_sched4_1d Start 207: shm_example_simple_lap_s_facto1_sched4_1d Start 208: shm_example_simple_lap_s_facto2_sched4_1d Start 209: shm_example_simple_lap_d_facto0_sched4_1d Start 210: shm_example_simple_lap_d_facto1_sched4_1d Start 211: shm_example_simple_lap_d_facto2_sched4_1d Start 212: shm_example_simple_lap_c_facto0_sched4_1d Start 213: shm_example_simple_lap_c_facto1_sched4_1d Start 214: shm_example_simple_lap_c_facto2_sched4_1d Start 215: shm_example_simple_lap_c_facto3_sched4_1d Start 216: shm_example_simple_lap_c_facto4_sched4_1d Start 217: shm_example_simple_lap_z_facto0_sched4_1d Start 218: shm_example_simple_lap_z_facto1_sched4_1d Start 219: shm_example_simple_lap_z_facto2_sched4_1d Start 220: shm_example_simple_lap_z_facto3_sched4_1d Start 221: shm_example_simple_lap_z_facto4_sched4_1d Start 222: shm_example_schur_lap_s_facto0_sched0_1d Start 223: shm_example_schur_lap_s_facto2_sched0_1d Start 224: shm_example_schur_lap_d_facto0_sched0_1d Start 225: shm_example_schur_lap_d_facto2_sched0_1d Start 226: shm_example_schur_lap_c_facto0_sched0_1d Start 227: shm_example_schur_lap_c_facto2_sched0_1d Start 228: shm_example_schur_lap_z_facto0_sched0_1d Start 229: shm_example_schur_lap_z_facto2_sched0_1d Start 230: shm_example_schur_lap_s_facto0_sched1_1d Start 231: shm_example_schur_lap_s_facto2_sched1_1d Start 232: shm_example_schur_lap_d_facto0_sched1_1d Start 
233: shm_example_schur_lap_d_facto2_sched1_1d Start 234: shm_example_schur_lap_c_facto0_sched1_1d Start 235: shm_example_schur_lap_c_facto2_sched1_1d Start 236: shm_example_schur_lap_z_facto0_sched1_1d Start 237: shm_example_schur_lap_z_facto2_sched1_1d Start 238: shm_example_schur_lap_s_facto0_sched4_1d Start 239: shm_example_schur_lap_s_facto2_sched4_1d Start 240: shm_example_schur_lap_d_facto0_sched4_1d Start 241: shm_example_schur_lap_d_facto2_sched4_1d Start 242: shm_example_schur_lap_c_facto0_sched4_1d Start 243: shm_example_schur_lap_c_facto2_sched4_1d Start 244: shm_example_schur_lap_z_facto0_sched4_1d Start 245: shm_example_schur_lap_z_facto2_sched4_1d Start 246: shm_example_simple_lap_s_facto0_sched0_not_svdbegin Start 247: shm_example_simple_lap_s_facto0_sched0_not_svdend Start 248: shm_example_simple_lap_s_facto0_sched0_kway_svdbegin Start 249: shm_example_simple_lap_s_facto0_sched0_kway_svdend Start 250: shm_example_simple_lap_s_facto0_sched0_kwayprojections_svdbegin Start 251: shm_example_simple_lap_s_facto0_sched0_kwayprojections_svdend Start 252: shm_example_simple_lap_s_facto0_sched0_not_pqrcpbegin Start 253: shm_example_simple_lap_s_facto0_sched0_not_pqrcpend Start 254: shm_example_simple_lap_s_facto0_sched0_kway_pqrcpbegin Start 255: shm_example_simple_lap_s_facto0_sched0_kway_pqrcpend Start 256: shm_example_simple_lap_s_facto0_sched0_kwayprojections_pqrcpbegin Start 257: shm_example_simple_lap_s_facto0_sched0_kwayprojections_pqrcpend Start 258: shm_example_simple_lap_s_facto0_sched0_not_rqrcpbegin Start 259: shm_example_simple_lap_s_facto0_sched0_not_rqrcpend Start 260: shm_example_simple_lap_s_facto0_sched0_kway_rqrcpbegin Start 261: shm_example_simple_lap_s_facto0_sched0_kway_rqrcpend Start 262: shm_example_simple_lap_s_facto0_sched0_kwayprojections_rqrcpbegin Start 263: shm_example_simple_lap_s_facto0_sched0_kwayprojections_rqrcpend Start 264: shm_example_simple_lap_s_facto0_sched0_not_tqrcpbegin Start 265: 
shm_example_simple_lap_s_facto0_sched0_not_tqrcpend Start 266: shm_example_simple_lap_s_facto0_sched0_kway_tqrcpbegin Start 267: shm_example_simple_lap_s_facto0_sched0_kway_tqrcpend Start 268: shm_example_simple_lap_s_facto0_sched0_kwayprojections_tqrcpbegin Start 269: shm_example_simple_lap_s_facto0_sched0_kwayprojections_tqrcpend Start 270: shm_example_simple_lap_s_facto0_sched0_not_rqrrtbegin Start 271: shm_example_simple_lap_s_facto0_sched0_not_rqrrtend Start 272: shm_example_simple_lap_s_facto0_sched0_kway_rqrrtbegin Start 273: shm_example_simple_lap_s_facto0_sched0_kway_rqrrtend Start 274: shm_example_simple_lap_s_facto0_sched0_kwayprojections_rqrrtbegin Start 275: shm_example_simple_lap_s_facto0_sched0_kwayprojections_rqrrtend Start 276: shm_example_simple_lap_s_facto0_sched0_kway_pqrcpilu0 Start 277: shm_example_simple_lap_s_facto0_sched0_kway_pqrcpilu1 Start 278: shm_example_simple_lap_s_facto1_sched0_not_svdbegin Start 279: shm_example_simple_lap_s_facto1_sched0_not_svdend Start 280: shm_example_simple_lap_s_facto1_sched0_kway_svdbegin Start 281: shm_example_simple_lap_s_facto1_sched0_kway_svdend Start 282: shm_example_simple_lap_s_facto1_sched0_kwayprojections_svdbegin Start 283: shm_example_simple_lap_s_facto1_sched0_kwayprojections_svdend Start 284: shm_example_simple_lap_s_facto1_sched0_not_pqrcpbegin Start 285: shm_example_simple_lap_s_facto1_sched0_not_pqrcpend Start 286: shm_example_simple_lap_s_facto1_sched0_kway_pqrcpbegin Start 287: shm_example_simple_lap_s_facto1_sched0_kway_pqrcpend Start 288: shm_example_simple_lap_s_facto1_sched0_kwayprojections_pqrcpbegin Start 289: shm_example_simple_lap_s_facto1_sched0_kwayprojections_pqrcpend Start 290: shm_example_simple_lap_s_facto1_sched0_not_rqrcpbegin Start 291: shm_example_simple_lap_s_facto1_sched0_not_rqrcpend Start 292: shm_example_simple_lap_s_facto1_sched0_kway_rqrcpbegin Start 293: shm_example_simple_lap_s_facto1_sched0_kway_rqrcpend Start 294: 
shm_example_simple_lap_s_facto1_sched0_kwayprojections_rqrcpbegin Start 295: shm_example_simple_lap_s_facto1_sched0_kwayprojections_rqrcpend Start 296: shm_example_simple_lap_s_facto1_sched0_not_tqrcpbegin Start 297: shm_example_simple_lap_s_facto1_sched0_not_tqrcpend Start 298: shm_example_simple_lap_s_facto1_sched0_kway_tqrcpbegin Start 299: shm_example_simple_lap_s_facto1_sched0_kway_tqrcpend Start 300: shm_example_simple_lap_s_facto1_sched0_kwayprojections_tqrcpbegin Start 301: shm_example_simple_lap_s_facto1_sched0_kwayprojections_tqrcpend Start 302: shm_example_simple_lap_s_facto1_sched0_not_rqrrtbegin Start 303: shm_example_simple_lap_s_facto1_sched0_not_rqrrtend Start 304: shm_example_simple_lap_s_facto1_sched0_kway_rqrrtbegin Start 305: shm_example_simple_lap_s_facto1_sched0_kway_rqrrtend Start 306: shm_example_simple_lap_s_facto1_sched0_kwayprojections_rqrrtbegin Start 307: shm_example_simple_lap_s_facto1_sched0_kwayprojections_rqrrtend Start 308: shm_example_simple_lap_s_facto1_sched0_kway_pqrcpilu0 Start 309: shm_example_simple_lap_s_facto1_sched0_kway_pqrcpilu1 Start 310: shm_example_simple_lap_s_facto2_sched0_not_svdbegin Start 311: shm_example_simple_lap_s_facto2_sched0_not_svdend Start 312: shm_example_simple_lap_s_facto2_sched0_kway_svdbegin Start 313: shm_example_simple_lap_s_facto2_sched0_kway_svdend Start 314: shm_example_simple_lap_s_facto2_sched0_kwayprojections_svdbegin Start 315: shm_example_simple_lap_s_facto2_sched0_kwayprojections_svdend Start 316: shm_example_simple_lap_s_facto2_sched0_not_pqrcpbegin Start 317: shm_example_simple_lap_s_facto2_sched0_not_pqrcpend Start 318: shm_example_simple_lap_s_facto2_sched0_kway_pqrcpbegin Start 319: shm_example_simple_lap_s_facto2_sched0_kway_pqrcpend Start 320: shm_example_simple_lap_s_facto2_sched0_kwayprojections_pqrcpbegin Start 321: shm_example_simple_lap_s_facto2_sched0_kwayprojections_pqrcpend Start 322: shm_example_simple_lap_s_facto2_sched0_not_rqrcpbegin Start 323: 
shm_example_simple_lap_s_facto2_sched0_not_rqrcpend Start 324: shm_example_simple_lap_s_facto2_sched0_kway_rqrcpbegin Start 325: shm_example_simple_lap_s_facto2_sched0_kway_rqrcpend Start 326: shm_example_simple_lap_s_facto2_sched0_kwayprojections_rqrcpbegin Start 327: shm_example_simple_lap_s_facto2_sched0_kwayprojections_rqrcpend Start 328: shm_example_simple_lap_s_facto2_sched0_not_tqrcpbegin Start 329: shm_example_simple_lap_s_facto2_sched0_not_tqrcpend Start 330: shm_example_simple_lap_s_facto2_sched0_kway_tqrcpbegin Start 331: shm_example_simple_lap_s_facto2_sched0_kway_tqrcpend Start 332: shm_example_simple_lap_s_facto2_sched0_kwayprojections_tqrcpbegin Start 333: shm_example_simple_lap_s_facto2_sched0_kwayprojections_tqrcpend Start 334: shm_example_simple_lap_s_facto2_sched0_not_rqrrtbegin Start 335: shm_example_simple_lap_s_facto2_sched0_not_rqrrtend Start 336: shm_example_simple_lap_s_facto2_sched0_kway_rqrrtbegin Start 337: shm_example_simple_lap_s_facto2_sched0_kway_rqrrtend Start 338: shm_example_simple_lap_s_facto2_sched0_kwayprojections_rqrrtbegin Start 339: shm_example_simple_lap_s_facto2_sched0_kwayprojections_rqrrtend Start 340: shm_example_simple_lap_s_facto2_sched0_kway_pqrcpilu0 Start 341: shm_example_simple_lap_s_facto2_sched0_kway_pqrcpilu1 Start 342: shm_example_simple_lap_d_facto0_sched0_not_svdbegin Start 343: shm_example_simple_lap_d_facto0_sched0_not_svdend Start 344: shm_example_simple_lap_d_facto0_sched0_kway_svdbegin Start 345: shm_example_simple_lap_d_facto0_sched0_kway_svdend Start 346: shm_example_simple_lap_d_facto0_sched0_kwayprojections_svdbegin Start 347: shm_example_simple_lap_d_facto0_sched0_kwayprojections_svdend Start 348: shm_example_simple_lap_d_facto0_sched0_not_pqrcpbegin Start 349: shm_example_simple_lap_d_facto0_sched0_not_pqrcpend Start 350: shm_example_simple_lap_d_facto0_sched0_kway_pqrcpbegin Start 351: shm_example_simple_lap_d_facto0_sched0_kway_pqrcpend Start 352: 
shm_example_simple_lap_d_facto0_sched0_kwayprojections_pqrcpbegin Start 353: shm_example_simple_lap_d_facto0_sched0_kwayprojections_pqrcpend Start 354: shm_example_simple_lap_d_facto0_sched0_not_rqrcpbegin Start 355: shm_example_simple_lap_d_facto0_sched0_not_rqrcpend Start 356: shm_example_simple_lap_d_facto0_sched0_kway_rqrcpbegin Start 357: shm_example_simple_lap_d_facto0_sched0_kway_rqrcpend Start 358: shm_example_simple_lap_d_facto0_sched0_kwayprojections_rqrcpbegin Start 359: shm_example_simple_lap_d_facto0_sched0_kwayprojections_rqrcpend Start 360: shm_example_simple_lap_d_facto0_sched0_not_tqrcpbegin Start 361: shm_example_simple_lap_d_facto0_sched0_not_tqrcpend Start 362: shm_example_simple_lap_d_facto0_sched0_kway_tqrcpbegin Start 363: shm_example_simple_lap_d_facto0_sched0_kway_tqrcpend Start 364: shm_example_simple_lap_d_facto0_sched0_kwayprojections_tqrcpbegin Start 365: shm_example_simple_lap_d_facto0_sched0_kwayprojections_tqrcpend Start 366: shm_example_simple_lap_d_facto0_sched0_not_rqrrtbegin Start 367: shm_example_simple_lap_d_facto0_sched0_not_rqrrtend Start 368: shm_example_simple_lap_d_facto0_sched0_kway_rqrrtbegin Start 369: shm_example_simple_lap_d_facto0_sched0_kway_rqrrtend Start 370: shm_example_simple_lap_d_facto0_sched0_kwayprojections_rqrrtbegin Start 371: shm_example_simple_lap_d_facto0_sched0_kwayprojections_rqrrtend Start 372: shm_example_simple_lap_d_facto0_sched0_kway_pqrcpilu0 Start 373: shm_example_simple_lap_d_facto0_sched0_kway_pqrcpilu1 Start 374: shm_example_simple_lap_d_facto1_sched0_not_svdbegin Start 375: shm_example_simple_lap_d_facto1_sched0_not_svdend Start 376: shm_example_simple_lap_d_facto1_sched0_kway_svdbegin Start 377: shm_example_simple_lap_d_facto1_sched0_kway_svdend Start 378: shm_example_simple_lap_d_facto1_sched0_kwayprojections_svdbegin Start 379: shm_example_simple_lap_d_facto1_sched0_kwayprojections_svdend Start 380: shm_example_simple_lap_d_facto1_sched0_not_pqrcpbegin Start 381: 
shm_example_simple_lap_d_facto1_sched0_not_pqrcpend Start 382: shm_example_simple_lap_d_facto1_sched0_kway_pqrcpbegin Start 383: shm_example_simple_lap_d_facto1_sched0_kway_pqrcpend Start 384: shm_example_simple_lap_d_facto1_sched0_kwayprojections_pqrcpbegin Start 385: shm_example_simple_lap_d_facto1_sched0_kwayprojections_pqrcpend Start 386: shm_example_simple_lap_d_facto1_sched0_not_rqrcpbegin Start 387: shm_example_simple_lap_d_facto1_sched0_not_rqrcpend Start 388: shm_example_simple_lap_d_facto1_sched0_kway_rqrcpbegin Start 389: shm_example_simple_lap_d_facto1_sched0_kway_rqrcpend Start 390: shm_example_simple_lap_d_facto1_sched0_kwayprojections_rqrcpbegin Start 391: shm_example_simple_lap_d_facto1_sched0_kwayprojections_rqrcpend Start 392: shm_example_simple_lap_d_facto1_sched0_not_tqrcpbegin Start 393: shm_example_simple_lap_d_facto1_sched0_not_tqrcpend Start 394: shm_example_simple_lap_d_facto1_sched0_kway_tqrcpbegin Start 395: shm_example_simple_lap_d_facto1_sched0_kway_tqrcpend Start 396: shm_example_simple_lap_d_facto1_sched0_kwayprojections_tqrcpbegin Start 397: shm_example_simple_lap_d_facto1_sched0_kwayprojections_tqrcpend Start 398: shm_example_simple_lap_d_facto1_sched0_not_rqrrtbegin Start 399: shm_example_simple_lap_d_facto1_sched0_not_rqrrtend Start 400: shm_example_simple_lap_d_facto1_sched0_kway_rqrrtbegin Start 401: shm_example_simple_lap_d_facto1_sched0_kway_rqrrtend Start 402: shm_example_simple_lap_d_facto1_sched0_kwayprojections_rqrrtbegin Start 403: shm_example_simple_lap_d_facto1_sched0_kwayprojections_rqrrtend Start 404: shm_example_simple_lap_d_facto1_sched0_kway_pqrcpilu0 Start 405: shm_example_simple_lap_d_facto1_sched0_kway_pqrcpilu1 Start 406: shm_example_simple_lap_d_facto2_sched0_not_svdbegin Start 407: shm_example_simple_lap_d_facto2_sched0_not_svdend Start 408: shm_example_simple_lap_d_facto2_sched0_kway_svdbegin Start 409: shm_example_simple_lap_d_facto2_sched0_kway_svdend Start 410: 
shm_example_simple_lap_d_facto2_sched0_kwayprojections_svdbegin Start 411: shm_example_simple_lap_d_facto2_sched0_kwayprojections_svdend Start 412: shm_example_simple_lap_d_facto2_sched0_not_pqrcpbegin Start 413: shm_example_simple_lap_d_facto2_sched0_not_pqrcpend Start 414: shm_example_simple_lap_d_facto2_sched0_kway_pqrcpbegin Start 415: shm_example_simple_lap_d_facto2_sched0_kway_pqrcpend Start 416: shm_example_simple_lap_d_facto2_sched0_kwayprojections_pqrcpbegin Start 417: shm_example_simple_lap_d_facto2_sched0_kwayprojections_pqrcpend Start 418: shm_example_simple_lap_d_facto2_sched0_not_rqrcpbegin Start 419: shm_example_simple_lap_d_facto2_sched0_not_rqrcpend Start 420: shm_example_simple_lap_d_facto2_sched0_kway_rqrcpbegin Start 421: shm_example_simple_lap_d_facto2_sched0_kway_rqrcpend Start 422: shm_example_simple_lap_d_facto2_sched0_kwayprojections_rqrcpbegin Start 423: shm_example_simple_lap_d_facto2_sched0_kwayprojections_rqrcpend Start 424: shm_example_simple_lap_d_facto2_sched0_not_tqrcpbegin Start 425: shm_example_simple_lap_d_facto2_sched0_not_tqrcpend Start 426: shm_example_simple_lap_d_facto2_sched0_kway_tqrcpbegin Start 427: shm_example_simple_lap_d_facto2_sched0_kway_tqrcpend Start 428: shm_example_simple_lap_d_facto2_sched0_kwayprojections_tqrcpbegin Start 429: shm_example_simple_lap_d_facto2_sched0_kwayprojections_tqrcpend Start 430: shm_example_simple_lap_d_facto2_sched0_not_rqrrtbegin Start 431: shm_example_simple_lap_d_facto2_sched0_not_rqrrtend Start 432: shm_example_simple_lap_d_facto2_sched0_kway_rqrrtbegin Start 433: shm_example_simple_lap_d_facto2_sched0_kway_rqrrtend Start 434: shm_example_simple_lap_d_facto2_sched0_kwayprojections_rqrrtbegin Start 435: shm_example_simple_lap_d_facto2_sched0_kwayprojections_rqrrtend Start 436: shm_example_simple_lap_d_facto2_sched0_kway_pqrcpilu0 Start 437: shm_example_simple_lap_d_facto2_sched0_kway_pqrcpilu1 Start 438: shm_example_simple_lap_c_facto0_sched0_not_svdbegin Start 439: 
shm_example_simple_lap_c_facto0_sched0_not_svdend Start 440: shm_example_simple_lap_c_facto0_sched0_kway_svdbegin Start 441: shm_example_simple_lap_c_facto0_sched0_kway_svdend Start 442: shm_example_simple_lap_c_facto0_sched0_kwayprojections_svdbegin Start 443: shm_example_simple_lap_c_facto0_sched0_kwayprojections_svdend Start 444: shm_example_simple_lap_c_facto0_sched0_not_pqrcpbegin Start 445: shm_example_simple_lap_c_facto0_sched0_not_pqrcpend Start 446: shm_example_simple_lap_c_facto0_sched0_kway_pqrcpbegin Start 447: shm_example_simple_lap_c_facto0_sched0_kway_pqrcpend Start 448: shm_example_simple_lap_c_facto0_sched0_kwayprojections_pqrcpbegin Start 449: shm_example_simple_lap_c_facto0_sched0_kwayprojections_pqrcpend Start 450: shm_example_simple_lap_c_facto0_sched0_not_rqrcpbegin Start 451: shm_example_simple_lap_c_facto0_sched0_not_rqrcpend Start 452: shm_example_simple_lap_c_facto0_sched0_kway_rqrcpbegin Start 453: shm_example_simple_lap_c_facto0_sched0_kway_rqrcpend Start 454: shm_example_simple_lap_c_facto0_sched0_kwayprojections_rqrcpbegin Start 455: shm_example_simple_lap_c_facto0_sched0_kwayprojections_rqrcpend Start 456: shm_example_simple_lap_c_facto0_sched0_not_tqrcpbegin Start 457: shm_example_simple_lap_c_facto0_sched0_not_tqrcpend Start 458: shm_example_simple_lap_c_facto0_sched0_kway_tqrcpbegin Start 459: shm_example_simple_lap_c_facto0_sched0_kway_tqrcpend Start 460: shm_example_simple_lap_c_facto0_sched0_kwayprojections_tqrcpbegin Start 461: shm_example_simple_lap_c_facto0_sched0_kwayprojections_tqrcpend Start 462: shm_example_simple_lap_c_facto0_sched0_not_rqrrtbegin Start 463: shm_example_simple_lap_c_facto0_sched0_not_rqrrtend Start 464: shm_example_simple_lap_c_facto0_sched0_kway_rqrrtbegin Start 465: shm_example_simple_lap_c_facto0_sched0_kway_rqrrtend Start 466: shm_example_simple_lap_c_facto0_sched0_kwayprojections_rqrrtbegin Start 467: shm_example_simple_lap_c_facto0_sched0_kwayprojections_rqrrtend Start 468: 
shm_example_simple_lap_c_facto0_sched0_kway_pqrcpilu0 Start 469: shm_example_simple_lap_c_facto0_sched0_kway_pqrcpilu1 Start 470: shm_example_simple_lap_c_facto1_sched0_not_svdbegin Start 471: shm_example_simple_lap_c_facto1_sched0_not_svdend Start 472: shm_example_simple_lap_c_facto1_sched0_kway_svdbegin Start 473: shm_example_simple_lap_c_facto1_sched0_kway_svdend Start 474: shm_example_simple_lap_c_facto1_sched0_kwayprojections_svdbegin Start 475: shm_example_simple_lap_c_facto1_sched0_kwayprojections_svdend Start 476: shm_example_simple_lap_c_facto1_sched0_not_pqrcpbegin Start 477: shm_example_simple_lap_c_facto1_sched0_not_pqrcpend Start 478: shm_example_simple_lap_c_facto1_sched0_kway_pqrcpbegin Start 479: shm_example_simple_lap_c_facto1_sched0_kway_pqrcpend Start 480: shm_example_simple_lap_c_facto1_sched0_kwayprojections_pqrcpbegin Start 481: shm_example_simple_lap_c_facto1_sched0_kwayprojections_pqrcpend Start 482: shm_example_simple_lap_c_facto1_sched0_not_rqrcpbegin Start 483: shm_example_simple_lap_c_facto1_sched0_not_rqrcpend Start 484: shm_example_simple_lap_c_facto1_sched0_kway_rqrcpbegin Start 485: shm_example_simple_lap_c_facto1_sched0_kway_rqrcpend Start 486: shm_example_simple_lap_c_facto1_sched0_kwayprojections_rqrcpbegin Start 487: shm_example_simple_lap_c_facto1_sched0_kwayprojections_rqrcpend Start 488: shm_example_simple_lap_c_facto1_sched0_not_tqrcpbegin Start 489: shm_example_simple_lap_c_facto1_sched0_not_tqrcpend Start 490: shm_example_simple_lap_c_facto1_sched0_kway_tqrcpbegin Start 491: shm_example_simple_lap_c_facto1_sched0_kway_tqrcpend Start 492: shm_example_simple_lap_c_facto1_sched0_kwayprojections_tqrcpbegin Start 493: shm_example_simple_lap_c_facto1_sched0_kwayprojections_tqrcpend Start 494: shm_example_simple_lap_c_facto1_sched0_not_rqrrtbegin Start 495: shm_example_simple_lap_c_facto1_sched0_not_rqrrtend Start 496: shm_example_simple_lap_c_facto1_sched0_kway_rqrrtbegin Start 497: 
shm_example_simple_lap_c_facto1_sched0_kway_rqrrtend Start 498: shm_example_simple_lap_c_facto1_sched0_kwayprojections_rqrrtbegin Start 499: shm_example_simple_lap_c_facto1_sched0_kwayprojections_rqrrtend Start 500: shm_example_simple_lap_c_facto1_sched0_kway_pqrcpilu0 Start 501: shm_example_simple_lap_c_facto1_sched0_kway_pqrcpilu1 Start 502: shm_example_simple_lap_c_facto2_sched0_not_svdbegin Start 503: shm_example_simple_lap_c_facto2_sched0_not_svdend Start 504: shm_example_simple_lap_c_facto2_sched0_kway_svdbegin Start 505: shm_example_simple_lap_c_facto2_sched0_kway_svdend Start 506: shm_example_simple_lap_c_facto2_sched0_kwayprojections_svdbegin Start 507: shm_example_simple_lap_c_facto2_sched0_kwayprojections_svdend Start 508: shm_example_simple_lap_c_facto2_sched0_not_pqrcpbegin Start 509: shm_example_simple_lap_c_facto2_sched0_not_pqrcpend Start 510: shm_example_simple_lap_c_facto2_sched0_kway_pqrcpbegin Start 511: shm_example_simple_lap_c_facto2_sched0_kway_pqrcpend Start 512: shm_example_simple_lap_c_facto2_sched0_kwayprojections_pqrcpbegin 1/3626 Test #106: c_shm_example_reentrant_lap_s_facto1 .................................... Passed 167.65 sec Start 513: shm_example_simple_lap_c_facto2_sched0_kwayprojections_pqrcpend 2/3626 Test #120: c_shm_example_reentrant_lap_z_facto4 .................................... Passed 168.36 sec Start 514: shm_example_simple_lap_c_facto2_sched0_not_rqrcpbegin 3/3626 Test #109: c_shm_example_reentrant_lap_d_facto1 .................................... Passed 168.87 sec Start 515: shm_example_simple_lap_c_facto2_sched0_not_rqrcpend 4/3626 Test #59: c_shm_example_simple_trans_lap_c_facto4 ................................. Passed 169.31 sec Start 516: shm_example_simple_lap_c_facto2_sched0_kway_rqrcpbegin 5/3626 Test #60: c_shm_example_simple_trans_lap_z_facto0 ................................. 
Passed 170.00 sec Start 517: shm_example_simple_lap_c_facto2_sched0_kway_rqrcpend 6/3626 Test #105: c_shm_example_reentrant_lap_s_facto0 .................................... Passed 169.94 sec Start 518: shm_example_simple_lap_c_facto2_sched0_kwayprojections_rqrcpbegin 7/3626 Test #112: c_shm_example_reentrant_lap_c_facto1 .................................... Passed 171.58 sec Start 519: shm_example_simple_lap_c_facto2_sched0_kwayprojections_rqrcpend 8/3626 Test #64: c_shm_example_simple_trans_lap_z_facto4 ................................. Passed 171.98 sec Start 520: shm_example_simple_lap_c_facto2_sched0_not_tqrcpbegin 9/3626 Test #115: c_shm_example_reentrant_lap_c_facto4 .................................... Passed 171.85 sec Start 521: shm_example_simple_lap_c_facto2_sched0_not_tqrcpend 10/3626 Test #119: c_shm_example_reentrant_lap_z_facto3 .................................... Passed 171.96 sec Start 522: shm_example_simple_lap_c_facto2_sched0_kway_tqrcpbegin 11/3626 Test #117: c_shm_example_reentrant_lap_z_facto1 .................................... Passed 172.72 sec Start 523: shm_example_simple_lap_c_facto2_sched0_kway_tqrcpend 12/3626 Test #1: c_shm_example_analyze_lap_s_facto0 ...................................... Passed 174.26 sec Start 524: shm_example_simple_lap_c_facto2_sched0_kwayprojections_tqrcpbegin 13/3626 Test #116: c_shm_example_reentrant_lap_z_facto0 .................................... Passed 173.73 sec Start 525: shm_example_simple_lap_c_facto2_sched0_kwayprojections_tqrcpend 14/3626 Test #111: c_shm_example_reentrant_lap_c_facto0 .................................... Passed 173.80 sec Start 526: shm_example_simple_lap_c_facto2_sched0_not_rqrrtbegin 15/3626 Test #107: c_shm_example_reentrant_lap_s_facto2 .................................... Passed 174.46 sec Start 527: shm_example_simple_lap_c_facto2_sched0_not_rqrrtend 16/3626 Test #4: c_shm_example_analyze_lap_d_facto0 ...................................... 
Passed 175.28 sec Start 528: shm_example_simple_lap_c_facto2_sched0_kway_rqrrtbegin 17/3626 Test #3: c_shm_example_analyze_lap_s_facto2 ...................................... Passed 175.62 sec Start 529: shm_example_simple_lap_c_facto2_sched0_kway_rqrrtend 18/3626 Test #108: c_shm_example_reentrant_lap_d_facto0 .................................... Passed 175.76 sec Start 530: shm_example_simple_lap_c_facto2_sched0_kwayprojections_rqrrtbegin 19/3626 Test #114: c_shm_example_reentrant_lap_c_facto3 .................................... Passed 175.99 sec Start 531: shm_example_simple_lap_c_facto2_sched0_kwayprojections_rqrrtend 20/3626 Test #55: c_shm_example_simple_trans_lap_c_facto0 ................................. Passed 176.88 sec Start 532: shm_example_simple_lap_c_facto2_sched0_kway_pqrcpilu0 21/3626 Test #110: c_shm_example_reentrant_lap_d_facto2 .................................... Passed 177.00 sec Start 533: shm_example_simple_lap_c_facto2_sched0_kway_pqrcpilu1 22/3626 Test #113: c_shm_example_reentrant_lap_c_facto2 .................................... Passed 177.32 sec Start 534: shm_example_simple_lap_c_facto3_sched0_not_svdbegin 23/3626 Test #2: c_shm_example_analyze_lap_s_facto1 ...................................... Passed 178.14 sec Start 535: shm_example_simple_lap_c_facto3_sched0_not_svdend 24/3626 Test #118: c_shm_example_reentrant_lap_z_facto2 .................................... Passed 178.72 sec Start 536: shm_example_simple_lap_c_facto3_sched0_kway_svdbegin 25/3626 Test #9: c_shm_example_analyze_lap_c_facto2 ...................................... Passed 179.58 sec Start 537: shm_example_simple_lap_c_facto3_sched0_kway_svdend 26/3626 Test #6: c_shm_example_analyze_lap_d_facto2 ...................................... Passed 180.03 sec Start 538: shm_example_simple_lap_c_facto3_sched0_kwayprojections_svdbegin 27/3626 Test #7: c_shm_example_analyze_lap_c_facto0 ...................................... 
Passed 180.20 sec Start 539: shm_example_simple_lap_c_facto3_sched0_kwayprojections_svdend 28/3626 Test #14: c_shm_example_analyze_lap_z_facto2 ...................................... Passed 180.30 sec Start 540: shm_example_simple_lap_c_facto3_sched0_not_pqrcpbegin 29/3626 Test #224: shm_example_schur_lap_d_facto0_sched0_1d ................................ Passed 181.08 sec Start 541: shm_example_simple_lap_c_facto3_sched0_not_pqrcpend 30/3626 Test #16: c_shm_example_analyze_lap_z_facto4 ...................................... Passed 183.35 sec Start 542: shm_example_simple_lap_c_facto3_sched0_kway_pqrcpbegin 31/3626 Test #8: c_shm_example_analyze_lap_c_facto1 ...................................... Passed 183.73 sec Start 543: shm_example_simple_lap_c_facto3_sched0_kway_pqrcpend 32/3626 Test #249: shm_example_simple_lap_s_facto0_sched0_kway_svdend ...................... Passed 182.22 sec Start 544: shm_example_simple_lap_c_facto3_sched0_kwayprojections_pqrcpbegin 33/3626 Test #10: c_shm_example_analyze_lap_c_facto3 ...................................... Passed 183.96 sec Start 545: shm_example_simple_lap_c_facto3_sched0_kwayprojections_pqrcpend 34/3626 Test #5: c_shm_example_analyze_lap_d_facto1 ...................................... Passed 184.14 sec Start 546: shm_example_simple_lap_c_facto3_sched0_not_rqrcpbegin 35/3626 Test #229: shm_example_schur_lap_z_facto2_sched0_1d ................................ Passed 183.10 sec Start 547: shm_example_simple_lap_c_facto3_sched0_not_rqrcpend 36/3626 Test #189: shm_example_simple_lap_z_facto4_sched0_1d ............................... Passed 183.54 sec Start 548: shm_example_simple_lap_c_facto3_sched0_kway_rqrcpbegin 37/3626 Test #175: shm_example_simple_lap_s_facto1_sched0_1d ............................... Passed 183.80 sec Start 549: shm_example_simple_lap_c_facto3_sched0_kway_rqrcpend 38/3626 Test #181: shm_example_simple_lap_c_facto1_sched0_1d ............................... 
Passed 183.94 sec Start 550: shm_example_simple_lap_c_facto3_sched0_kwayprojections_rqrcpbegin 39/3626 Test #222: shm_example_schur_lap_s_facto0_sched0_1d ................................ Passed 183.80 sec Start 551: shm_example_simple_lap_c_facto3_sched0_kwayprojections_rqrcpend 40/3626 Test #239: shm_example_schur_lap_s_facto2_sched4_1d ................................ Passed 184.12 sec Start 552: shm_example_simple_lap_c_facto3_sched0_not_tqrcpbegin 41/3626 Test #394: shm_example_simple_lap_d_facto1_sched0_kway_tqrcpbegin .................. Passed 182.72 sec Start 553: shm_example_simple_lap_c_facto3_sched0_not_tqrcpend 42/3626 Test #228: shm_example_schur_lap_z_facto0_sched0_1d ................................ Passed 184.32 sec Start 554: shm_example_simple_lap_c_facto3_sched0_kway_tqrcpbegin 43/3626 Test #425: shm_example_simple_lap_d_facto2_sched0_not_tqrcpend ..................... Passed 182.65 sec Start 555: shm_example_simple_lap_c_facto3_sched0_kway_tqrcpend 44/3626 Test #255: shm_example_simple_lap_s_facto0_sched0_kway_pqrcpend .................... Passed 184.11 sec Start 556: shm_example_simple_lap_c_facto3_sched0_kwayprojections_tqrcpbegin 45/3626 Test #265: shm_example_simple_lap_s_facto0_sched0_not_tqrcpend ..................... Passed 184.09 sec Start 557: shm_example_simple_lap_c_facto3_sched0_kwayprojections_tqrcpend 46/3626 Test #403: shm_example_simple_lap_d_facto1_sched0_kwayprojections_rqrrtend ......... Passed 183.16 sec Start 558: shm_example_simple_lap_c_facto3_sched0_not_rqrrtbegin 47/3626 Test #234: shm_example_schur_lap_c_facto0_sched1_1d ................................ Passed 184.88 sec Start 559: shm_example_simple_lap_c_facto3_sched0_not_rqrrtend 48/3626 Test #334: shm_example_simple_lap_s_facto2_sched0_not_rqrrtbegin ................... Passed 184.10 sec Start 560: shm_example_simple_lap_c_facto3_sched0_kway_rqrrtbegin 49/3626 Test #126: c_shm_example_simple_single_mm .......................................... 
Passed 185.89 sec Start 561: shm_example_simple_lap_c_facto3_sched0_kway_rqrrtend 50/3626 Test #275: shm_example_simple_lap_s_facto0_sched0_kwayprojections_rqrrtend ......... Passed 184.85 sec Start 562: shm_example_simple_lap_c_facto3_sched0_kwayprojections_rqrrtbegin 51/3626 Test #83: c_shm_example_schur_lap_d_facto0 ........................................ Passed 186.39 sec Start 563: shm_example_simple_lap_c_facto3_sched0_kwayprojections_rqrrtend 52/3626 Test #370: shm_example_simple_lap_d_facto0_sched0_kwayprojections_rqrrtbegin ....... Passed 184.21 sec Start 564: shm_example_simple_lap_c_facto3_sched0_kway_pqrcpilu0 53/3626 Test #124: c_shm_example_simple_scotch_mm2 ......................................... Passed 186.24 sec Start 565: shm_example_simple_lap_c_facto3_sched0_kway_pqrcpilu1 54/3626 Test #391: shm_example_simple_lap_d_facto1_sched0_kwayprojections_rqrcpend ......... Passed 184.31 sec Start 566: shm_example_simple_lap_c_facto4_sched0_not_svdbegin 55/3626 Test #401: shm_example_simple_lap_d_facto1_sched0_kway_rqrrtend .................... Passed 184.24 sec Start 567: shm_example_simple_lap_c_facto4_sched0_not_svdend 56/3626 Test #81: c_shm_example_schur_lap_s_facto0 ........................................ Passed 186.80 sec Start 568: shm_example_simple_lap_c_facto4_sched0_kway_svdbegin 57/3626 Test #225: shm_example_schur_lap_d_facto2_sched0_1d ................................ Passed 186.07 sec Start 569: shm_example_simple_lap_c_facto4_sched0_kway_svdend 58/3626 Test #195: shm_example_simple_lap_d_facto2_sched1_1d ............................... Passed 186.32 sec Start 570: shm_example_simple_lap_c_facto4_sched0_kwayprojections_svdbegin 59/3626 Test #360: shm_example_simple_lap_d_facto0_sched0_not_tqrcpbegin ................... Passed 184.88 sec Start 571: shm_example_simple_lap_c_facto4_sched0_kwayprojections_svdend 60/3626 Test #226: shm_example_schur_lap_c_facto0_sched0_1d ................................ 
Passed 186.27 sec Start 572: shm_example_simple_lap_c_facto4_sched0_not_pqrcpbegin 61/3626 Test #21: c_shm_example_simple_lap_d_facto1 ....................................... Passed 187.50 sec Start 573: shm_example_simple_lap_c_facto4_sched0_not_pqrcpend 62/3626 Test #432: shm_example_simple_lap_d_facto2_sched0_kway_rqrrtbegin .................. Passed 184.38 sec Start 574: shm_example_simple_lap_c_facto4_sched0_kway_pqrcpbegin 63/3626 Test #174: shm_example_simple_lap_s_facto0_sched0_1d ............................... Passed 186.68 sec Start 575: shm_example_simple_lap_c_facto4_sched0_kway_pqrcpend 64/3626 Test #389: shm_example_simple_lap_d_facto1_sched0_kway_rqrcpend .................... Passed 184.87 sec Start 576: shm_example_simple_lap_c_facto4_sched0_kwayprojections_pqrcpbegin 65/3626 Test #455: shm_example_simple_lap_c_facto0_sched0_kwayprojections_rqrcpend ......... Passed 184.36 sec Start 577: shm_example_simple_lap_c_facto4_sched0_kwayprojections_pqrcpend 66/3626 Test #233: shm_example_schur_lap_d_facto2_sched1_1d ................................ Passed 186.49 sec Start 578: shm_example_simple_lap_c_facto4_sched0_not_rqrcpbegin 67/3626 Test #351: shm_example_simple_lap_d_facto0_sched0_kway_pqrcpend .................... Passed 185.32 sec Start 579: shm_example_simple_lap_c_facto4_sched0_not_rqrcpend 68/3626 Test #29: c_shm_example_simple_lap_z_facto1 ....................................... Passed 187.77 sec Start 580: shm_example_simple_lap_c_facto4_sched0_kway_rqrcpbegin 69/3626 Test #273: shm_example_simple_lap_s_facto0_sched0_kway_rqrrtend .................... Passed 186.09 sec Start 581: shm_example_simple_lap_c_facto4_sched0_kway_rqrcpend 70/3626 Test #236: shm_example_schur_lap_z_facto0_sched1_1d ................................ Passed 186.75 sec Start 582: shm_example_simple_lap_c_facto4_sched0_kwayprojections_rqrcpbegin 71/3626 Test #408: shm_example_simple_lap_d_facto2_sched0_kway_svdbegin .................... 
Passed 185.15 sec Start 583: shm_example_simple_lap_c_facto4_sched0_kwayprojections_rqrcpend 72/3626 Test #380: shm_example_simple_lap_d_facto1_sched0_not_pqrcpbegin ................... Passed 185.46 sec Start 584: shm_example_simple_lap_c_facto4_sched0_not_tqrcpbegin 73/3626 Test #305: shm_example_simple_lap_s_facto1_sched0_kway_rqrrtend .................... Passed 186.09 sec Start 585: shm_example_simple_lap_c_facto4_sched0_not_tqrcpend 74/3626 Test #227: shm_example_schur_lap_c_facto2_sched0_1d ................................ Passed 186.97 sec Start 586: shm_example_simple_lap_c_facto4_sched0_kway_tqrcpbegin 75/3626 Test #85: c_shm_example_schur_lap_c_facto0 ........................................ Passed 187.92 sec Start 587: shm_example_simple_lap_c_facto4_sched0_kway_tqrcpend 76/3626 Test #274: shm_example_simple_lap_s_facto0_sched0_kwayprojections_rqrrtbegin ....... Passed 186.48 sec Start 588: shm_example_simple_lap_c_facto4_sched0_kwayprojections_tqrcpbegin 77/3626 Test #342: shm_example_simple_lap_d_facto0_sched0_not_svdbegin ..................... Passed 185.96 sec Start 589: shm_example_simple_lap_c_facto4_sched0_kwayprojections_tqrcpend 78/3626 Test #15: c_shm_example_analyze_lap_z_facto3 ...................................... Passed 188.56 sec Start 590: shm_example_simple_lap_c_facto4_sched0_not_rqrrtbegin 79/3626 Test #371: shm_example_simple_lap_d_facto0_sched0_kwayprojections_rqrrtend ......... Passed 185.94 sec Start 591: shm_example_simple_lap_c_facto4_sched0_not_rqrrtend 80/3626 Test #88: c_shm_example_schur_lap_z_facto2 ........................................ Passed 188.22 sec Start 592: shm_example_simple_lap_c_facto4_sched0_kway_rqrrtbegin 81/3626 Test #196: shm_example_simple_lap_c_facto0_sched1_1d ............................... Passed 187.62 sec Start 593: shm_example_simple_lap_c_facto4_sched0_kway_rqrrtend 82/3626 Test #241: shm_example_schur_lap_d_facto2_sched4_1d ................................ 
Passed 187.34 sec Start 594: shm_example_simple_lap_c_facto4_sched0_kwayprojections_rqrrtbegin 83/3626 Test #237: shm_example_schur_lap_z_facto2_sched1_1d ................................ Passed 187.56 sec Start 595: shm_example_simple_lap_c_facto4_sched0_kwayprojections_rqrrtend 84/3626 Test #433: shm_example_simple_lap_d_facto2_sched0_kway_rqrrtend .................... Passed 185.77 sec Start 596: shm_example_simple_lap_c_facto4_sched0_kway_pqrcpilu0 85/3626 Test #182: shm_example_simple_lap_c_facto2_sched0_1d ............................... Passed 188.04 sec Start 597: shm_example_simple_lap_c_facto4_sched0_kway_pqrcpilu1 86/3626 Test #179: shm_example_simple_lap_d_facto2_sched0_1d ............................... Passed 188.09 sec Start 598: shm_example_simple_lap_z_facto0_sched0_not_svdbegin 87/3626 Test #358: shm_example_simple_lap_d_facto0_sched0_kwayprojections_rqrcpbegin ....... Passed 186.62 sec Start 599: shm_example_simple_lap_z_facto0_sched0_not_svdend 88/3626 Test #128: c_shm_example_simple_single_mm2 ......................................... Passed 188.54 sec Start 600: shm_example_simple_lap_z_facto0_sched0_kway_svdbegin 89/3626 Test #180: shm_example_simple_lap_c_facto0_sched0_1d ............................... Passed 188.24 sec Start 601: shm_example_simple_lap_z_facto0_sched0_kway_svdend 90/3626 Test #511: shm_example_simple_lap_c_facto2_sched0_kway_pqrcpend .................... Passed 184.97 sec Start 602: shm_example_simple_lap_z_facto0_sched0_kwayprojections_svdbegin 91/3626 Test #304: shm_example_simple_lap_s_facto1_sched0_kway_rqrrtbegin .................. Passed 187.34 sec Start 603: shm_example_simple_lap_z_facto0_sched0_kwayprojections_svdend 92/3626 Test #449: shm_example_simple_lap_c_facto0_sched0_kwayprojections_pqrcpend ......... Passed 186.15 sec Start 604: shm_example_simple_lap_z_facto0_sched0_not_pqrcpbegin 93/3626 Test #399: shm_example_simple_lap_d_facto1_sched0_not_rqrrtend ..................... 
Passed 186.62 sec Start 605: shm_example_simple_lap_z_facto0_sched0_not_pqrcpend 94/3626 Test #431: shm_example_simple_lap_d_facto2_sched0_not_rqrrtend ..................... Passed 186.35 sec Start 606: shm_example_simple_lap_z_facto0_sched0_kway_pqrcpbegin 95/3626 Test #86: c_shm_example_schur_lap_c_facto2 ........................................ Passed 189.36 sec Start 607: shm_example_simple_lap_z_facto0_sched0_kway_pqrcpend 96/3626 Test #242: shm_example_schur_lap_c_facto0_sched4_1d ................................ Passed 188.39 sec Start 608: shm_example_simple_lap_z_facto0_sched0_kwayprojections_pqrcpbegin 97/3626 Test #303: shm_example_simple_lap_s_facto1_sched0_not_rqrrtend ..................... Passed 187.74 sec Start 609: shm_example_simple_lap_z_facto0_sched0_kwayprojections_pqrcpend 98/3626 Test #293: shm_example_simple_lap_s_facto1_sched0_kway_rqrcpend .................... Passed 187.88 sec Start 610: shm_example_simple_lap_z_facto0_sched0_not_rqrcpbegin 99/3626 Test #260: shm_example_simple_lap_s_facto0_sched0_kway_rqrcpbegin .................. Passed 188.23 sec Start 611: shm_example_simple_lap_z_facto0_sched0_not_rqrcpend 100/3626 Test #318: shm_example_simple_lap_s_facto2_sched0_kway_pqrcpbegin .................. Passed 187.90 sec Start 612: shm_example_simple_lap_z_facto0_sched0_kway_rqrcpbegin 101/3626 Test #261: shm_example_simple_lap_s_facto0_sched0_kway_rqrcpend .................... Passed 188.38 sec Start 613: shm_example_simple_lap_z_facto0_sched0_kway_rqrcpend 102/3626 Test #245: shm_example_schur_lap_z_facto2_sched4_1d ................................ Passed 188.53 sec Start 614: shm_example_simple_lap_z_facto0_sched0_kwayprojections_rqrcpbegin 103/3626 Test #402: shm_example_simple_lap_d_facto1_sched0_kwayprojections_rqrrtbegin ....... Passed 187.27 sec Start 615: shm_example_simple_lap_z_facto0_sched0_kwayprojections_rqrcpend 104/3626 Test #383: shm_example_simple_lap_d_facto1_sched0_kway_pqrcpend .................... 
Passed 187.50 sec Start 616: shm_example_simple_lap_z_facto0_sched0_not_tqrcpbegin 105/3626 Test #33: c_shm_example_simple_solve_and_refine_lap_s_facto0 ...................... Passed 190.16 sec Start 617: shm_example_simple_lap_z_facto0_sched0_not_tqrcpend 106/3626 Test #446: shm_example_simple_lap_c_facto0_sched0_kway_pqrcpbegin .................. Passed 187.03 sec Start 618: shm_example_simple_lap_z_facto0_sched0_kway_tqrcpbegin 107/3626 Test #40: c_shm_example_simple_solve_and_refine_lap_c_facto1 ...................... Passed 190.33 sec Start 619: shm_example_simple_lap_z_facto0_sched0_kway_tqrcpend 108/3626 Test #243: shm_example_schur_lap_c_facto2_sched4_1d ................................ Passed 189.18 sec Start 620: shm_example_simple_lap_z_facto0_sched0_kwayprojections_tqrcpbegin 109/3626 Test #301: shm_example_simple_lap_s_facto1_sched0_kwayprojections_tqrcpend ......... Passed 188.51 sec Start 621: shm_example_simple_lap_z_facto0_sched0_kwayprojections_tqrcpend 110/3626 Test #467: shm_example_simple_lap_c_facto0_sched0_kwayprojections_rqrrtend ......... Passed 187.10 sec Start 622: shm_example_simple_lap_z_facto0_sched0_not_rqrrtbegin 111/3626 Test #23: c_shm_example_simple_lap_c_facto0 ....................................... Passed 190.64 sec Start 623: shm_example_simple_lap_z_facto0_sched0_not_rqrrtend 112/3626 Test #323: shm_example_simple_lap_s_facto2_sched0_not_rqrcpend ..................... Passed 188.46 sec Start 624: shm_example_simple_lap_z_facto0_sched0_kway_rqrrtbegin 113/3626 Test #343: shm_example_simple_lap_d_facto0_sched0_not_svdend ....................... Passed 188.33 sec Start 625: shm_example_simple_lap_z_facto0_sched0_kway_rqrrtend 114/3626 Test #185: shm_example_simple_lap_z_facto0_sched0_1d ............................... Passed 189.77 sec Start 626: shm_example_simple_lap_z_facto0_sched0_kwayprojections_rqrrtbegin 115/3626 Test #317: shm_example_simple_lap_s_facto2_sched0_not_pqrcpend ..................... 
Passed 188.73 sec Start 627: shm_example_simple_lap_z_facto0_sched0_kwayprojections_rqrrtend 116/3626 Test #253: shm_example_simple_lap_s_facto0_sched0_not_pqrcpend ..................... Passed 189.26 sec Start 628: shm_example_simple_lap_z_facto0_sched0_kway_pqrcpilu0 117/3626 Test #235: shm_example_schur_lap_c_facto2_sched1_1d ................................ Passed 189.65 sec Start 629: shm_example_simple_lap_z_facto0_sched0_kway_pqrcpilu1 118/3626 Test #429: shm_example_simple_lap_d_facto2_sched0_kwayprojections_tqrcpend ......... Passed 187.90 sec Start 630: shm_example_simple_lap_z_facto1_sched0_not_svdbegin 119/3626 Test #38: c_shm_example_simple_solve_and_refine_lap_d_facto2 ...................... Passed 191.27 sec Start 631: shm_example_simple_lap_z_facto1_sched0_not_svdend 120/3626 Test #353: shm_example_simple_lap_d_facto0_sched0_kwayprojections_pqrcpend ......... Passed 189.00 sec Start 632: shm_example_simple_lap_z_facto1_sched0_kway_svdbegin 121/3626 Test #319: shm_example_simple_lap_s_facto2_sched0_kway_pqrcpend .................... Passed 189.32 sec Start 633: shm_example_simple_lap_z_facto1_sched0_kway_svdend 122/3626 Test #37: c_shm_example_simple_solve_and_refine_lap_d_facto1 ...................... Passed 192.15 sec 123/3626 Test #39: c_shm_example_simple_solve_and_refine_lap_c_facto0 ...................... Passed 192.14 sec 124/3626 Test #42: c_shm_example_simple_solve_and_refine_lap_c_facto3 ...................... Passed 192.13 sec 125/3626 Test #183: shm_example_simple_lap_c_facto3_sched0_1d ............................... Passed 191.29 sec 126/3626 Test #186: shm_example_simple_lap_z_facto1_sched0_1d ............................... Passed 191.27 sec 127/3626 Test #223: shm_example_schur_lap_s_facto2_sched0_1d ................................ Passed 191.05 sec 128/3626 Test #232: shm_example_schur_lap_d_facto0_sched1_1d ................................ 
Passed 190.99 sec 129/3626 Test #281: shm_example_simple_lap_s_facto1_sched0_kway_svdend ...................... Passed 190.38 sec 130/3626 Test #289: shm_example_simple_lap_s_facto1_sched0_kwayprojections_pqrcpend ......... Passed 190.31 sec 131/3626 Test #307: shm_example_simple_lap_s_facto1_sched0_kwayprojections_rqrrtend ......... Passed 190.17 sec 132/3626 Test #349: shm_example_simple_lap_d_facto0_sched0_not_pqrcpend ..................... Passed 189.83 sec 133/3626 Test #462: shm_example_simple_lap_c_facto0_sched0_not_rqrrtbegin ................... Passed 188.84 sec 134/3626 Test #471: shm_example_simple_lap_c_facto1_sched0_not_svdend ....................... Passed 188.76 sec 135/3626 Test #495: shm_example_simple_lap_c_facto1_sched0_not_rqrrtend ..................... Passed 188.53 sec Start 634: shm_example_simple_lap_z_facto1_sched0_kwayprojections_svdbegin Start 635: shm_example_simple_lap_z_facto1_sched0_kwayprojections_svdend Start 636: shm_example_simple_lap_z_facto1_sched0_not_pqrcpbegin Start 637: shm_example_simple_lap_z_facto1_sched0_not_pqrcpend Start 638: shm_example_simple_lap_z_facto1_sched0_kway_pqrcpbegin Start 639: shm_example_simple_lap_z_facto1_sched0_kway_pqrcpend Start 640: shm_example_simple_lap_z_facto1_sched0_kwayprojections_pqrcpbegin Start 641: shm_example_simple_lap_z_facto1_sched0_kwayprojections_pqrcpend Start 642: shm_example_simple_lap_z_facto1_sched0_not_rqrcpbegin Start 643: shm_example_simple_lap_z_facto1_sched0_not_rqrcpend Start 644: shm_example_simple_lap_z_facto1_sched0_kway_rqrcpbegin Start 645: shm_example_simple_lap_z_facto1_sched0_kway_rqrcpend Start 646: shm_example_simple_lap_z_facto1_sched0_kwayprojections_rqrcpbegin Start 647: shm_example_simple_lap_z_facto1_sched0_kwayprojections_rqrcpend 136/3626 Test #244: shm_example_schur_lap_z_facto0_sched4_1d ................................ Passed 191.23 sec 137/3626 Test #176: shm_example_simple_lap_s_facto2_sched0_1d ............................... 
Passed 191.91 sec 138/3626 Test #184: shm_example_simple_lap_c_facto4_sched0_1d ............................... Passed 191.86 sec 139/3626 Test #187: shm_example_simple_lap_z_facto2_sched0_1d ............................... Passed 191.85 sec 140/3626 Test #188: shm_example_simple_lap_z_facto3_sched0_1d ............................... Passed 191.84 sec Start 648: shm_example_simple_lap_z_facto1_sched0_not_tqrcpbegin Start 649: shm_example_simple_lap_z_facto1_sched0_not_tqrcpend Start 650: shm_example_simple_lap_z_facto1_sched0_kway_tqrcpbegin Start 651: shm_example_simple_lap_z_facto1_sched0_kway_tqrcpend Start 652: shm_example_simple_lap_z_facto1_sched0_kwayprojections_tqrcpbegin 141/3626 Test #28: c_shm_example_simple_lap_z_facto0 ....................................... Passed 193.02 sec 142/3626 Test #297: shm_example_simple_lap_s_facto1_sched0_not_tqrcpend ..................... Passed 191.08 sec Start 653: shm_example_simple_lap_z_facto1_sched0_kwayprojections_tqrcpend Start 654: shm_example_simple_lap_z_facto1_sched0_not_rqrrtbegin 143/3626 Test #254: shm_example_simple_lap_s_facto0_sched0_kway_pqrcpbegin .................. Passed 191.49 sec Start 655: shm_example_simple_lap_z_facto1_sched0_not_rqrrtend 144/3626 Test #320: shm_example_simple_lap_s_facto2_sched0_kwayprojections_pqrcpbegin ....... Passed 191.09 sec Start 656: shm_example_simple_lap_z_facto1_sched0_kway_rqrrtbegin 145/3626 Test #331: shm_example_simple_lap_s_facto2_sched0_kway_tqrcpend .................... Passed 191.03 sec Start 657: shm_example_simple_lap_z_facto1_sched0_kway_rqrrtend 146/3626 Test #469: shm_example_simple_lap_c_facto0_sched0_kway_pqrcpilu1 ................... Passed 189.90 sec Start 658: shm_example_simple_lap_z_facto1_sched0_kwayprojections_rqrrtbegin 147/3626 Test #134: c_shm_example_simple_refine_gmres ....................................... Passed 193.05 sec 148/3626 Test #208: shm_example_simple_lap_s_facto2_sched4_1d ............................... 
Passed 192.69 sec 149/3626 Test #283: shm_example_simple_lap_s_facto1_sched0_kwayprojections_svdend ........... Passed 191.91 sec 150/3626 Test #365: shm_example_simple_lap_d_facto0_sched0_kwayprojections_tqrcpend ......... Passed 191.41 sec Start 659: shm_example_simple_lap_z_facto1_sched0_kwayprojections_rqrrtend Start 660: shm_example_simple_lap_z_facto1_sched0_kway_pqrcpilu0 Start 661: shm_example_simple_lap_z_facto1_sched0_kway_pqrcpilu1 Start 662: shm_example_simple_lap_z_facto2_sched0_not_svdbegin 151/3626 Test #512: shm_example_simple_lap_c_facto2_sched0_kwayprojections_pqrcpbegin ....... Passed 190.58 sec 152/3626 Test #329: shm_example_simple_lap_s_facto2_sched0_not_tqrcpend ..................... Passed 192.57 sec 153/3626 Test #43: c_shm_example_simple_solve_and_refine_lap_c_facto4 ...................... Passed 194.72 sec 154/3626 Test #485: shm_example_simple_lap_c_facto1_sched0_kway_rqrcpend .................... Passed 191.24 sec 155/3626 Test #344: shm_example_simple_lap_d_facto0_sched0_kway_svdbegin .................... Passed 192.47 sec 156/3626 Test #63: c_shm_example_simple_trans_lap_z_facto3 ................................. Passed 194.63 sec 157/3626 Test #193: shm_example_simple_lap_d_facto0_sched1_1d ............................... Passed 193.85 sec 158/3626 Test #423: shm_example_simple_lap_d_facto2_sched0_kwayprojections_rqrcpend ......... Passed 191.81 sec 159/3626 Test #306: shm_example_simple_lap_s_facto1_sched0_kwayprojections_rqrrtbegin ....... Passed 192.80 sec 160/3626 Test #337: shm_example_simple_lap_s_facto2_sched0_kway_rqrrtend .................... Passed 192.55 sec 161/3626 Test #246: shm_example_simple_lap_s_facto0_sched0_not_svdbegin ..................... Passed 193.32 sec 162/3626 Test #268: shm_example_simple_lap_s_facto0_sched0_kwayprojections_tqrcpbegin ....... Passed 193.15 sec 163/3626 Test #270: shm_example_simple_lap_s_facto0_sched0_not_rqrrtbegin ................... 
Passed 193.13 sec 164/3626 Test #341: shm_example_simple_lap_s_facto2_sched0_kway_pqrcpilu1 ................... Passed 192.56 sec 165/3626 Test #361: shm_example_simple_lap_d_facto0_sched0_not_tqrcpend ..................... Passed 192.40 sec 166/3626 Test #456: shm_example_simple_lap_c_facto0_sched0_not_tqrcpbegin ................... Passed 191.58 sec 167/3626 Test #497: shm_example_simple_lap_c_facto1_sched0_kway_rqrrtend .................... Passed 191.18 sec Start 663: shm_example_simple_lap_z_facto2_sched0_not_svdend Start 664: shm_example_simple_lap_z_facto2_sched0_kway_svdbegin Start 665: shm_example_simple_lap_z_facto2_sched0_kway_svdend Start 666: shm_example_simple_lap_z_facto2_sched0_kwayprojections_svdbegin Start 667: shm_example_simple_lap_z_facto2_sched0_kwayprojections_svdend Start 668: shm_example_simple_lap_z_facto2_sched0_not_pqrcpbegin Start 669: shm_example_simple_lap_z_facto2_sched0_not_pqrcpend Start 670: shm_example_simple_lap_z_facto2_sched0_kway_pqrcpbegin Start 671: shm_example_simple_lap_z_facto2_sched0_kway_pqrcpend Start 672: shm_example_simple_lap_z_facto2_sched0_kwayprojections_pqrcpbegin Start 673: shm_example_simple_lap_z_facto2_sched0_kwayprojections_pqrcpend Start 674: shm_example_simple_lap_z_facto2_sched0_not_rqrcpbegin Start 675: shm_example_simple_lap_z_facto2_sched0_not_rqrcpend Start 676: shm_example_simple_lap_z_facto2_sched0_kway_rqrcpbegin Start 677: shm_example_simple_lap_z_facto2_sched0_kway_rqrcpend Start 678: shm_example_simple_lap_z_facto2_sched0_kwayprojections_rqrcpbegin Start 679: shm_example_simple_lap_z_facto2_sched0_kwayprojections_rqrcpend 168/3626 Test #247: shm_example_simple_lap_s_facto0_sched0_not_svdend ....................... Passed 195.43 sec 169/3626 Test #12: c_shm_example_analyze_lap_z_facto0 ...................................... Passed 197.08 sec 170/3626 Test #18: c_shm_example_simple_lap_s_facto1 ....................................... 
Passed 197.04 sec 171/3626 Test #397: shm_example_simple_lap_d_facto1_sched0_kwayprojections_tqrcpend ......... Passed 194.20 sec 172/3626 Test #202: shm_example_simple_lap_z_facto1_sched1_1d ............................... Passed 195.96 sec 173/3626 Test #279: shm_example_simple_lap_s_facto1_sched0_not_svdend ....................... Passed 195.18 sec 174/3626 Test #355: shm_example_simple_lap_d_facto0_sched0_not_rqrcpend ..................... Passed 194.56 sec 175/3626 Test #328: shm_example_simple_lap_s_facto2_sched0_not_tqrcpbegin ................... Passed 194.78 sec 176/3626 Test #464: shm_example_simple_lap_c_facto0_sched0_kway_rqrrtbegin .................. Passed 193.61 sec 177/3626 Test #82: c_shm_example_schur_lap_s_facto2 ........................................ Passed 196.70 sec 178/3626 Test #499: shm_example_simple_lap_c_facto1_sched0_kwayprojections_rqrrtend ......... Passed 193.25 sec 179/3626 Test #231: shm_example_schur_lap_s_facto2_sched1_1d ................................ Passed 195.79 sec 180/3626 Test #412: shm_example_simple_lap_d_facto2_sched0_not_pqrcpbegin ................... Passed 194.07 sec 181/3626 Test #287: shm_example_simple_lap_s_facto1_sched0_kway_pqrcpend .................... Passed 195.12 sec 182/3626 Test #251: shm_example_simple_lap_s_facto0_sched0_kwayprojections_svdend ........... Passed 195.41 sec 183/3626 Test #441: shm_example_simple_lap_c_facto0_sched0_kway_svdend ...................... Passed 193.83 sec 184/3626 Test #501: shm_example_simple_lap_c_facto1_sched0_kway_pqrcpilu1 ................... Passed 192.97 sec 185/3626 Test #19: c_shm_example_simple_lap_s_facto2 ....................................... Passed 197.07 sec 186/3626 Test #36: c_shm_example_simple_solve_and_refine_lap_d_facto0 ...................... Passed 196.98 sec 187/3626 Test #53: c_shm_example_simple_trans_lap_d_facto1 ................................. 
Passed 196.89 sec 188/3626 Test #61: c_shm_example_simple_trans_lap_z_facto1 ................................. Passed 196.85 sec 189/3626 Test #122: c_shm_example_simple_scotch_mm .......................................... Passed 196.48 sec 190/3626 Test #123: c_shm_example_simple_scotch_hb .......................................... Passed 196.48 sec 191/3626 Test #177: shm_example_simple_lap_d_facto0_sched0_1d ............................... Passed 196.15 sec 192/3626 Test #220: shm_example_simple_lap_z_facto3_sched4_1d ............................... Passed 195.89 sec 193/3626 Test #230: shm_example_schur_lap_s_facto0_sched1_1d ................................ Passed 195.83 sec 194/3626 Test #257: shm_example_simple_lap_s_facto0_sched0_kwayprojections_pqrcpend ......... Passed 195.39 sec 195/3626 Test #259: shm_example_simple_lap_s_facto0_sched0_not_rqrcpend ..................... Passed 195.38 sec 196/3626 Test #263: shm_example_simple_lap_s_facto0_sched0_kwayprojections_rqrcpend ......... Passed 195.35 sec 197/3626 Test #264: shm_example_simple_lap_s_facto0_sched0_not_tqrcpbegin ................... Passed 195.34 sec 198/3626 Test #266: shm_example_simple_lap_s_facto0_sched0_kway_tqrcpbegin .................. Passed 195.33 sec 199/3626 Test #285: shm_example_simple_lap_s_facto1_sched0_not_pqrcpend ..................... Passed 195.17 sec 200/3626 Test #313: shm_example_simple_lap_s_facto2_sched0_kway_svdend ...................... Passed 194.95 sec 201/3626 Test #321: shm_example_simple_lap_s_facto2_sched0_kwayprojections_pqrcpend ......... Passed 194.88 sec 202/3626 Test #327: shm_example_simple_lap_s_facto2_sched0_kwayprojections_rqrcpend ......... Passed 194.84 sec 203/3626 Test #340: shm_example_simple_lap_s_facto2_sched0_kway_pqrcpilu0 ................... Passed 194.73 sec 204/3626 Test #345: shm_example_simple_lap_d_facto0_sched0_kway_svdend ...................... 
Passed 194.69 sec 205/3626 Test #352: shm_example_simple_lap_d_facto0_sched0_kwayprojections_pqrcpbegin ....... Passed 194.64 sec 206/3626 Test #354: shm_example_simple_lap_d_facto0_sched0_not_rqrcpbegin ................... Passed 194.62 sec 207/3626 Test #357: shm_example_simple_lap_d_facto0_sched0_kway_rqrcpend .................... Passed 194.60 sec 208/3626 Test #366: shm_example_simple_lap_d_facto0_sched0_not_rqrrtbegin ................... Passed 194.53 sec 209/3626 Test #368: shm_example_simple_lap_d_facto0_sched0_kway_rqrrtbegin .................. Passed 194.52 sec 210/3626 Test #385: shm_example_simple_lap_d_facto1_sched0_kwayprojections_pqrcpend ......... Passed 194.38 sec 211/3626 Test #405: shm_example_simple_lap_d_facto1_sched0_kway_pqrcpilu1 ................... Passed 194.20 sec 212/3626 Test #407: shm_example_simple_lap_d_facto2_sched0_not_svdend ....................... Passed 194.19 sec 213/3626 Test #411: shm_example_simple_lap_d_facto2_sched0_kwayprojections_svdend ........... Passed 194.13 sec 214/3626 Test #414: shm_example_simple_lap_d_facto2_sched0_kway_pqrcpbegin .................. Passed 194.11 sec 215/3626 Test #427: shm_example_simple_lap_d_facto2_sched0_kway_tqrcpend .................... Passed 194.00 sec 216/3626 Test #444: shm_example_simple_lap_c_facto0_sched0_not_pqrcpbegin ................... Passed 193.86 sec 217/3626 Test #448: shm_example_simple_lap_c_facto0_sched0_kwayprojections_pqrcpbegin ....... Passed 193.82 sec 218/3626 Test #454: shm_example_simple_lap_c_facto0_sched0_kwayprojections_rqrcpbegin ....... Passed 193.77 sec 219/3626 Test #461: shm_example_simple_lap_c_facto0_sched0_kwayprojections_tqrcpend ......... Passed 193.71 sec 220/3626 Test #468: shm_example_simple_lap_c_facto0_sched0_kway_pqrcpilu0 ................... 
Passed 193.65 sec Start 680: shm_example_simple_lap_z_facto2_sched0_not_tqrcpbegin Start 681: shm_example_simple_lap_z_facto2_sched0_not_tqrcpend Start 682: shm_example_simple_lap_z_facto2_sched0_kway_tqrcpbegin Start 683: shm_example_simple_lap_z_facto2_sched0_kway_tqrcpend Start 684: shm_example_simple_lap_z_facto2_sched0_kwayprojections_tqrcpbegin Start 685: shm_example_simple_lap_z_facto2_sched0_kwayprojections_tqrcpend Start 686: shm_example_simple_lap_z_facto2_sched0_not_rqrrtbegin Start 687: shm_example_simple_lap_z_facto2_sched0_not_rqrrtend Start 688: shm_example_simple_lap_z_facto2_sched0_kway_rqrrtbegin Start 689: shm_example_simple_lap_z_facto2_sched0_kway_rqrrtend Start 690: shm_example_simple_lap_z_facto2_sched0_kwayprojections_rqrrtbegin Start 691: shm_example_simple_lap_z_facto2_sched0_kwayprojections_rqrrtend Start 692: shm_example_simple_lap_z_facto2_sched0_kway_pqrcpilu0 Start 693: shm_example_simple_lap_z_facto2_sched0_kway_pqrcpilu1 Start 694: shm_example_simple_lap_z_facto3_sched0_not_svdbegin Start 695: shm_example_simple_lap_z_facto3_sched0_not_svdend Start 696: shm_example_simple_lap_z_facto3_sched0_kway_svdbegin Start 697: shm_example_simple_lap_z_facto3_sched0_kway_svdend Start 698: shm_example_simple_lap_z_facto3_sched0_kwayprojections_svdbegin Start 699: shm_example_simple_lap_z_facto3_sched0_kwayprojections_svdend Start 700: shm_example_simple_lap_z_facto3_sched0_not_pqrcpbegin Start 701: shm_example_simple_lap_z_facto3_sched0_not_pqrcpend Start 702: shm_example_simple_lap_z_facto3_sched0_kway_pqrcpbegin Start 703: shm_example_simple_lap_z_facto3_sched0_kway_pqrcpend Start 704: shm_example_simple_lap_z_facto3_sched0_kwayprojections_pqrcpbegin Start 705: shm_example_simple_lap_z_facto3_sched0_kwayprojections_pqrcpend Start 706: shm_example_simple_lap_z_facto3_sched0_not_rqrcpbegin Start 707: shm_example_simple_lap_z_facto3_sched0_not_rqrcpend Start 708: shm_example_simple_lap_z_facto3_sched0_kway_rqrcpbegin Start 709: 
shm_example_simple_lap_z_facto3_sched0_kway_rqrcpend Start 710: shm_example_simple_lap_z_facto3_sched0_kwayprojections_rqrcpbegin Start 711: shm_example_simple_lap_z_facto3_sched0_kwayprojections_rqrcpend Start 712: shm_example_simple_lap_z_facto3_sched0_not_tqrcpbegin Start 713: shm_example_simple_lap_z_facto3_sched0_not_tqrcpend Start 714: shm_example_simple_lap_z_facto3_sched0_kway_tqrcpbegin Start 715: shm_example_simple_lap_z_facto3_sched0_kway_tqrcpend Start 716: shm_example_simple_lap_z_facto3_sched0_kwayprojections_tqrcpbegin Start 717: shm_example_simple_lap_z_facto3_sched0_kwayprojections_tqrcpend Start 718: shm_example_simple_lap_z_facto3_sched0_not_rqrrtbegin Start 719: shm_example_simple_lap_z_facto3_sched0_not_rqrrtend Start 720: shm_example_simple_lap_z_facto3_sched0_kway_rqrrtbegin Start 721: shm_example_simple_lap_z_facto3_sched0_kway_rqrrtend Start 722: shm_example_simple_lap_z_facto3_sched0_kwayprojections_rqrrtbegin Start 723: shm_example_simple_lap_z_facto3_sched0_kwayprojections_rqrrtend Start 724: shm_example_simple_lap_z_facto3_sched0_kway_pqrcpilu0 Start 725: shm_example_simple_lap_z_facto3_sched0_kway_pqrcpilu1 Start 726: shm_example_simple_lap_z_facto4_sched0_not_svdbegin Start 727: shm_example_simple_lap_z_facto4_sched0_not_svdend Start 728: shm_example_simple_lap_z_facto4_sched0_kway_svdbegin Start 729: shm_example_simple_lap_z_facto4_sched0_kway_svdend Start 730: shm_example_simple_lap_z_facto4_sched0_kwayprojections_svdbegin Start 731: shm_example_simple_lap_z_facto4_sched0_kwayprojections_svdend Start 732: shm_example_simple_lap_z_facto4_sched0_not_pqrcpbegin 221/3626 Test #276: shm_example_simple_lap_s_facto0_sched0_kway_pqrcpilu0 ................... Passed 201.21 sec 222/3626 Test #483: shm_example_simple_lap_c_facto1_sched0_not_rqrcpend ..................... Passed 199.45 sec 223/3626 Test #507: shm_example_simple_lap_c_facto2_sched0_kwayprojections_svdend ........... 
Passed 198.88 sec 224/3626 Test #296: shm_example_simple_lap_s_facto1_sched0_not_tqrcpbegin ................... Passed 201.05 sec 225/3626 Test #487: shm_example_simple_lap_c_facto1_sched0_kwayprojections_rqrcpend ......... Passed 199.41 sec 226/3626 Test #350: shm_example_simple_lap_d_facto0_sched0_kway_pqrcpbegin .................. Passed 200.62 sec 227/3626 Test #417: shm_example_simple_lap_d_facto2_sched0_kwayprojections_pqrcpend ......... Passed 200.03 sec 228/3626 Test #424: shm_example_simple_lap_d_facto2_sched0_not_tqrcpbegin ................... Passed 199.98 sec 229/3626 Test #11: c_shm_example_analyze_lap_c_facto4 ...................................... Passed 203.10 sec 230/3626 Test #262: shm_example_simple_lap_s_facto0_sched0_kwayprojections_rqrcpbegin ....... Passed 201.33 sec 231/3626 Test #45: c_shm_example_simple_solve_and_refine_lap_z_facto1 ...................... Passed 202.92 sec 232/3626 Test #48: c_shm_example_simple_solve_and_refine_lap_z_facto4 ...................... Passed 202.90 sec 233/3626 Test #312: shm_example_simple_lap_s_facto2_sched0_kway_svdbegin .................... Passed 200.93 sec 234/3626 Test #240: shm_example_schur_lap_d_facto0_sched4_1d ................................ Passed 201.74 sec 235/3626 Test #400: shm_example_simple_lap_d_facto1_sched0_kway_rqrrtbegin .................. Passed 200.21 sec 236/3626 Test #190: shm_example_simple_lap_s_facto0_sched1_1d ............................... Passed 202.06 sec 237/3626 Test #496: shm_example_simple_lap_c_facto1_sched0_kway_rqrrtbegin .................. Passed 199.32 sec 238/3626 Test #314: shm_example_simple_lap_s_facto2_sched0_kwayprojections_svdbegin ......... Passed 200.92 sec 239/3626 Test #336: shm_example_simple_lap_s_facto2_sched0_kway_rqrrtbegin .................. Passed 200.74 sec 240/3626 Test #213: shm_example_simple_lap_c_facto1_sched4_1d ............................... 
Passed 201.93 sec 241/3626 Test #346: shm_example_simple_lap_d_facto0_sched0_kwayprojections_svdbegin ......... Passed 200.67 sec 242/3626 Test #84: c_shm_example_schur_lap_d_facto2 ........................................ Passed 202.72 sec 243/3626 Test #31: c_shm_example_simple_lap_z_facto3 ....................................... Passed 203.01 sec 244/3626 Test #238: shm_example_schur_lap_s_facto0_sched4_1d ................................ Passed 201.77 sec 245/3626 Test #272: shm_example_simple_lap_s_facto0_sched0_kway_rqrrtbegin .................. Passed 201.27 sec 246/3626 Test #22: c_shm_example_simple_lap_d_facto2 ....................................... Passed 203.11 sec 247/3626 Test #27: c_shm_example_simple_lap_c_facto4 ....................................... Passed 203.09 sec 248/3626 Test #35: c_shm_example_simple_solve_and_refine_lap_s_facto2 ...................... Passed 203.04 sec 249/3626 Test #41: c_shm_example_simple_solve_and_refine_lap_c_facto2 ...................... Passed 203.01 sec 250/3626 Test #44: c_shm_example_simple_solve_and_refine_lap_z_facto0 ...................... Passed 203.00 sec 251/3626 Test #52: c_shm_example_simple_trans_lap_d_facto0 ................................. Passed 202.95 sec 252/3626 Test #58: c_shm_example_simple_trans_lap_c_facto3 ................................. Passed 202.92 sec 253/3626 Test #127: c_shm_example_simple_single_hb .......................................... Passed 202.51 sec 254/3626 Test #194: shm_example_simple_lap_d_facto1_sched1_1d ............................... Passed 202.11 sec 255/3626 Test #210: shm_example_simple_lap_d_facto1_sched4_1d ............................... Passed 202.01 sec 256/3626 Test #211: shm_example_simple_lap_d_facto2_sched4_1d ............................... Passed 202.01 sec 257/3626 Test #250: shm_example_simple_lap_s_facto0_sched0_kwayprojections_svdbegin ......... 
Passed 201.51 sec 258/3626 Test #256: shm_example_simple_lap_s_facto0_sched0_kwayprojections_pqrcpbegin ....... Passed 201.46 sec 259/3626 Test #267: shm_example_simple_lap_s_facto0_sched0_kway_tqrcpend .................... Passed 201.38 sec 260/3626 Test #269: shm_example_simple_lap_s_facto0_sched0_kwayprojections_tqrcpend ......... Passed 201.36 sec 261/3626 Test #271: shm_example_simple_lap_s_facto0_sched0_not_rqrrtend ..................... Passed 201.35 sec 262/3626 Test #277: shm_example_simple_lap_s_facto0_sched0_kway_pqrcpilu1 ................... Passed 201.30 sec 263/3626 Test #292: shm_example_simple_lap_s_facto1_sched0_kway_rqrcpbegin .................. Passed 201.18 sec 264/3626 Test #299: shm_example_simple_lap_s_facto1_sched0_kway_tqrcpend .................... Passed 201.12 sec 265/3626 Test #302: shm_example_simple_lap_s_facto1_sched0_not_rqrrtbegin ................... Passed 201.10 sec 266/3626 Test #324: shm_example_simple_lap_s_facto2_sched0_kway_rqrcpbegin .................. Passed 200.92 sec 267/3626 Test #338: shm_example_simple_lap_s_facto2_sched0_kwayprojections_rqrrtbegin ....... Passed 200.81 sec 268/3626 Test #347: shm_example_simple_lap_d_facto0_sched0_kwayprojections_svdend ........... Passed 200.74 sec 269/3626 Test #348: shm_example_simple_lap_d_facto0_sched0_not_pqrcpbegin ................... Passed 200.73 sec 270/3626 Test #356: shm_example_simple_lap_d_facto0_sched0_kway_rqrcpbegin .................. Passed 200.67 sec 271/3626 Test #367: shm_example_simple_lap_d_facto0_sched0_not_rqrrtend ..................... Passed 200.58 sec 272/3626 Test #369: shm_example_simple_lap_d_facto0_sched0_kway_rqrrtend .................... Passed 200.56 sec 273/3626 Test #377: shm_example_simple_lap_d_facto1_sched0_kway_svdend ...................... Passed 200.50 sec 274/3626 Test #379: shm_example_simple_lap_d_facto1_sched0_kwayprojections_svdend ........... 
Passed 200.48 sec 275/3626 Test #384: shm_example_simple_lap_d_facto1_sched0_kwayprojections_pqrcpbegin ....... Passed 200.45 sec 276/3626 Test #386: shm_example_simple_lap_d_facto1_sched0_not_rqrcpbegin ................... Passed 200.43 sec 277/3626 Test #390: shm_example_simple_lap_d_facto1_sched0_kwayprojections_rqrcpbegin ....... Passed 200.40 sec 278/3626 Test #392: shm_example_simple_lap_d_facto1_sched0_not_tqrcpbegin ................... Passed 200.38 sec 279/3626 Test #396: shm_example_simple_lap_d_facto1_sched0_kwayprojections_tqrcpbegin ....... Passed 200.34 sec 280/3626 Test #398: shm_example_simple_lap_d_facto1_sched0_not_rqrrtbegin ................... Passed 200.33 sec 281/3626 Test #404: shm_example_simple_lap_d_facto1_sched0_kway_pqrcpilu0 ................... Passed 200.28 sec 282/3626 Test #421: shm_example_simple_lap_d_facto2_sched0_kway_rqrcpend .................... Passed 200.11 sec 283/3626 Test #435: shm_example_simple_lap_d_facto2_sched0_kwayprojections_rqrrtend ......... Passed 199.99 sec 284/3626 Test #457: shm_example_simple_lap_c_facto0_sched0_not_tqrcpend ..................... Passed 199.80 sec 285/3626 Test #458: shm_example_simple_lap_c_facto0_sched0_kway_tqrcpbegin .................. Passed 199.79 sec 286/3626 Test #475: shm_example_simple_lap_c_facto1_sched0_kwayprojections_svdend ........... Passed 199.64 sec 287/3626 Test #477: shm_example_simple_lap_c_facto1_sched0_not_pqrcpend ..................... Passed 199.63 sec 288/3626 Test #480: shm_example_simple_lap_c_facto1_sched0_kwayprojections_pqrcpbegin ....... Passed 199.60 sec 289/3626 Test #481: shm_example_simple_lap_c_facto1_sched0_kwayprojections_pqrcpend ......... Passed 199.59 sec 290/3626 Test #486: shm_example_simple_lap_c_facto1_sched0_kwayprojections_rqrcpbegin ....... Passed 199.55 sec 291/3626 Test #491: shm_example_simple_lap_c_facto1_sched0_kway_tqrcpend .................... 
Passed 199.50 sec 292/3626 Test #13: c_shm_example_analyze_lap_z_facto1 ......................................***Timeout 203.28 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.567761e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.478527e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.621335e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.540368e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.081783e+01 s Start 13: c_shm_example_analyze_lap_z_facto1 292/3626 Test #24: c_shm_example_simple_lap_c_facto1 .......................................***Timeout 203.29 sec ischedInit: The thread number has been 
automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.660818e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.980946e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.431468e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.195356e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.682699e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.292701e-03 s Time to initialize coeftab 2.556374e-01 s Time to factorize 4.758692e+00 s ( 4.48 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Memory usage of coeftab 638 Ko Time to solve 1.705264e+00 s Time for refinement 
8.446026e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.060124e-07 max(|| b_i - A x_i ||_1) 8.802963e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.221251e+00 (SUCCESS) max(|| x_i ||_oo) 6.822263e-01 max(|| x0_i - x_i ||_oo) 5.495277e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 8.054917e-01 (SUCCESS) Start 24: c_shm_example_simple_lap_c_facto1 292/3626 Test #25: c_shm_example_simple_lap_c_facto2 .......................................***Timeout 203.60 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.907030e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.488716e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.011348e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 
MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 2.879957e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.952263e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.386729e-02 s Time to initialize coeftab 8.824649e-02 s Time to factorize 8.648280e+00 s ( 4.62 MFlop/s) Number of operations 8.15 MFlops Number of static pivots 0 Memory usage of coeftab 1.25 Mo Time to solve 9.084609e-01 s Time for refinement 2.358908e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.983266e-07 max(|| b_i - A x_i ||_1) 8.395987e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.118559e+00 (SUCCESS) max(|| x_i ||_oo) 6.822264e-01 max(|| x0_i - x_i ||_oo) 3.885747e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 5.695686e-01 (SUCCESS) Start 25: c_shm_example_simple_lap_c_facto2 292/3626 Test #26: c_shm_example_simple_lap_c_facto3 .......................................***Timeout 203.77 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time 
to compute ordering 1.386213e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.879292e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.782054e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 2.743003e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.432942e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 3.457177e-03 s Time to initialize coeftab 4.855288e-02 s Time to factorize 5.198990e+00 s ( 3.90 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Memory usage of coeftab 638 Ko Time to solve 9.749103e-01 s Time for refinement 5.857179e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.015872e-07 max(|| b_i - A x_i ||_1) 9.024480e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.277146e+00 (SUCCESS) max(|| x_i ||_oo) 6.822264e-01 max(|| x0_i - x_i ||_oo) 3.793216e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 5.560056e-01 (SUCCESS) Start 26: c_shm_example_simple_lap_c_facto3 292/3626 Test #32: c_shm_example_simple_lap_z_facto4 .......................................***Timeout 204.00 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: 
Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.082709e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.017647e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.316458e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.915957e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.328352e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 1.717022e+00 s Time to initialize coeftab 1.194278e-01 s Time to factorize 3.426426e+00 s ( 6.22 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Memory usage of coeftab 1.25 Mo Time to solve 1.995854e+00 s Time for refinement 7.736824e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.833680e-16 max(|| b_i - A x_i ||_1) 1.871445e-16 max(|| b_i - A x_i ||_1 
/ (||A||_1 * ||x_i||_oo * eps)) 4.722292e-03 (SUCCESS) max(|| x_i ||_oo) 6.822263e-01 max(|| x0_i - x_i ||_oo) 1.621268e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.376437e-03 (SUCCESS) Start 32: c_shm_example_simple_lap_z_facto4 292/3626 Test #50: c_shm_example_simple_trans_lap_s_facto1 .................................***Timeout 204.26 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.058672e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.691515e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.924203e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.774637e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.158359e+01 
s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.100515e-02 s Time to initialize coeftab 1.177050e-01 s Time to factorize 1.727975e+00 s ( 3.03 MFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Memory usage of coeftab 319 Ko Time to solve 2.398505e+00 s Time for refinement 1.216923e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.979600e-07 max(|| b_i - A x_i ||_1) 8.420070e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.058036e+00 (SUCCESS) max(|| x_i ||_oo) 4.996108e-01 max(|| x0_i - x_i ||_oo) 3.576279e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 7.158130e-01 (SUCCESS) Start 50: c_shm_example_simple_trans_lap_s_facto1 292/3626 Test #51: c_shm_example_simple_trans_lap_s_facto2 .................................***Timeout 204.29 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.515970e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 
17.592432 Time to compute symbol matrix 1.993165e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.323595e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 8.800171e-02 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.574007e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.178855e-03 s Time to initialize coeftab 1.970434e-01 s Time to factorize 3.105261e+00 s ( 3.22 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Memory usage of coeftab 638 Ko Time to solve 1.819317e+00 s Time for refinement 1.650151e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.972065e-07 max(|| b_i - A x_i ||_1) 8.343743e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.048445e+00 (SUCCESS) max(|| x_i ||_oo) 4.996108e-01 max(|| x0_i - x_i ||_oo) 3.576279e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 7.158130e-01 (SUCCESS) Start 51: c_shm_example_simple_trans_lap_s_facto2 292/3626 Test #56: c_shm_example_simple_trans_lap_c_facto1 .................................***Timeout 204.38 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: 
PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.966207e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.081353e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.103710e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.428587e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.064162e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 6.583750e-03 s Time to initialize coeftab 1.485367e-01 s Time to factorize 4.441026e+00 s ( 4.80 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Memory usage of coeftab 638 Ko Time to solve 2.691377e+00 s Time for refinement 1.107465e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.116631e-07 max(|| b_i - A x_i ||_1) 8.943162e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.256627e+00 (SUCCESS) max(|| x_i ||_oo) 6.822263e-01 max(|| x0_i - x_i ||_oo) 6.078505e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 8.909807e-01 (SUCCESS) Start 
56: c_shm_example_simple_trans_lap_c_facto1 292/3626 Test #57: c_shm_example_simple_trans_lap_c_facto2 .................................***Timeout 204.41 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.279445e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.602221e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.058341e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 4.017237e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.329392e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.413305e-03 s Time to initialize coeftab 3.093024e-01 s 
Time to factorize 1.209810e+01 s ( 3.30 MFlop/s) Number of operations 8.15 MFlops Number of static pivots 0 Memory usage of coeftab 1.25 Mo Time to solve 1.619664e+00 s Time for refinement 1.267347e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.061956e-07 max(|| b_i - A x_i ||_1) 8.804545e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.221650e+00 (SUCCESS) max(|| x_i ||_oo) 6.822263e-01 max(|| x0_i - x_i ||_oo) 3.919884e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 5.745723e-01 (SUCCESS) Start 57: c_shm_example_simple_trans_lap_c_facto2 292/3626 Test #62: c_shm_example_simple_trans_lap_z_facto2 .................................***Timeout 204.51 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.414994e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.094641e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 
1.884426e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 5.245116e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.487773e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.334851e-03 s Time to initialize coeftab 3.074594e-01 s Time to factorize 7.737377e+00 s ( 5.17 MFlop/s) Number of operations 8.15 MFlops Number of static pivots 0 Memory usage of coeftab 2.49 Mo Time to solve 1.677866e+00 s Time for refinement 7.486434e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.814487e-16 max(|| b_i - A x_i ||_1) 1.871089e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.721392e-03 (SUCCESS) max(|| x_i ||_oo) 6.822263e-01 max(|| x0_i - x_i ||_oo) 1.374392e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.014568e-03 (SUCCESS) Start 62: c_shm_example_simple_trans_lap_z_facto2 292/3626 Test #87: c_shm_example_schur_lap_z_facto0 ........................................***Timeout 205.06 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank 
parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.315516e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 100810 Fill-in of L 27.245946 Time to compute symbol matrix 3.087096e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.789701e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 100810 Fill-in 27.245946 Number of operations in full-rank: LL^h 62.62 MFlops Prediction: Model AMD 6180 MKL Time to factorize 3.571986e-03 s Time for mapping/scheduling 5.068562e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.433298e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.311435e-02 s Time to initialize coeftab 1.051604e-01 s Time to factorize 3.490592e+00 s (17.94 MFlop/s) Number of operations 84.28 MFlops Number of static pivots 0 Memory usage of coeftab 2.09 Mo || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.343509e-16 max(|| b_i - A x_i ||_1) 2.560443e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.819430e+01 (SUCCESS) max(|| x_i ||_oo) 6.822263e-01 max(|| x0_i - x_i ||_oo) 2.448161e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 3.232223e+01 (SUCCESS) Start 87: c_shm_example_schur_lap_z_facto0 292/3626 Test #135: c_shm_example_simple_refine_bicgstab ....................................***Timeout 205.48 sec ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: General Arithmetic: Double Format: CSC N: 1030 nnz: 6858 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.966359e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 51109 Fill-in of L 7.452464 Time to compute symbol matrix 1.384257e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.218005e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 102218 Fill-in 14.904929 Number of operations in full-rank: LU 5.50 MFlops Prediction: Model AMD 6180 MKL Time to factorize 7.121319e-04 s Time for mapping/scheduling 1.178776e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.992992e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 4.842801e-03 s Time to initialize coeftab 6.048786e-02 s Time to factorize 1.138677e+00 s ( 4.83 MFlop/s) Number of operations 2.27 MFlops Number of static pivots 0 Memory usage of coeftab 987 Ko Time to solve 1.293877e+00 s Time for refinement 1.158020e+00 s || A ||_1 3.076897e-01 
max(|| b_i ||_oo) 8.377794e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.497923e-16 max(|| b_i - A x_i ||_1) 1.671681e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.309078e-03 (SUCCESS) Start 135: c_shm_example_simple_refine_bicgstab 292/3626 Test #155: c_shm_example_simple_mixed_refine_gmres .................................***Timeout 205.67 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: General Arithmetic: Double Format: CSC N: 1030 nnz: 6858 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.610887e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 51109 Fill-in of L 7.452464 Time to compute symbol matrix 8.098970e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.023898e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 102218 Fill-in 14.904929 Number of operations in full-rank: LU 5.50 MFlops Prediction: Model AMD 6180 MKL Time to factorize 7.121319e-04 s Time for mapping/scheduling 6.528029e-01 s +-------------------------------------------------+ 
Analyze task: Total time for analyze 1.697888e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.665970e-01 s Time to initialize coeftab 6.834865e-02 s Time to factorize 7.749361e-01 s ( 7.10 MFlop/s) Number of operations 2.27 MFlops Number of static pivots 0 Memory usage of coeftab 494 Ko Time to solve 1.270772e+00 s - iteration 1 : total iteration time 1.04 s error 1.6916e-11 - iteration 2 : total iteration time 0.899 s error 1.5247e-15 Time for refinement 2.828267e+00 s || A ||_1 3.076897e-01 max(|| b_i ||_oo) 8.377794e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.532322e-15 max(|| b_i - A x_i ||_1) 6.014640e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.710009e-03 (SUCCESS) Start 155: c_shm_example_simple_mixed_refine_gmres 292/3626 Test #156: c_shm_example_simple_mixed_refine_bicgstab ..............................***Timeout 205.68 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: General Arithmetic: Double Format: CSC N: 1030 nnz: 6858 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.141787e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of 
nonzeroes in L structure 51109 Fill-in of L 7.452464 Time to compute symbol matrix 7.266297e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.441330e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 102218 Fill-in 14.904929 Number of operations in full-rank: LU 5.50 MFlops Prediction: Model AMD 6180 MKL Time to factorize 7.121319e-04 s Time for mapping/scheduling 2.374678e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.175271e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 4.974666e-03 s Time to initialize coeftab 5.057258e-02 s Time to factorize 2.117648e+00 s ( 2.60 MFlop/s) Number of operations 2.27 MFlops Number of static pivots 0 Memory usage of coeftab 494 Ko Time to solve 1.600163e+00 s - iteration 1 : total iteration time 2.96 s error 2.6799e-15 Time for refinement 3.967685e+00 s || A ||_1 3.076897e-01 max(|| b_i ||_oo) 8.377794e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.661225e-15 max(|| b_i - A x_i ||_1) 1.036698e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.118285e-03 (SUCCESS) Start 156: c_shm_example_simple_mixed_refine_bicgstab 292/3626 Test #157: c_shm_example_simple_mixed_lap_d_facto0 .................................***Timeout 205.68 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution 
level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.880511e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.271722e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.152643e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 2.930493e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.065682e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.465853e-01 s Time to initialize coeftab 4.488772e-01 s Time to factorize 1.517646e+00 s ( 3.34 MFlop/s) Number of operations 6.47 MFlops Number of static pivots 0 Memory usage of coeftab 319 Ko Time to solve 1.818241e+00 s - iteration 1 : total iteration time 0.751 s error 6.5095e-14 Time for refinement 2.885086e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.509718e-14 max(|| b_i - A x_i ||_1) 1.923143e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.416594e-01 (SUCCESS) max(|| x_i ||_oo) 4.996108e-01 max(|| x0_i - x_i ||_oo) 1.716960e-13 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 3.436595e-01 
(SUCCESS) Start 157: c_shm_example_simple_mixed_lap_d_facto0 292/3626 Test #158: c_shm_example_simple_mixed_lap_d_facto1 .................................***Timeout 205.69 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.471747e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.647415e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.672007e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 8.263888e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.622969e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.501335e-01 s Time to initialize 
coeftab 1.882293e-01 s Time to factorize 1.285798e+00 s ( 4.07 MFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Memory usage of coeftab 319 Ko Time to solve 1.611331e+00 s - iteration 1 : total iteration time 0.765 s error 7.1443e-14 Time for refinement 1.937105e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.145200e-14 max(|| b_i - A x_i ||_1) 1.776255e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.232017e-01 (SUCCESS) max(|| x_i ||_oo) 4.996108e-01 max(|| x0_i - x_i ||_oo) 2.853828e-13 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 5.712103e-01 (SUCCESS) Start 158: c_shm_example_simple_mixed_lap_d_facto1 292/3626 Test #160: c_shm_example_simple_mixed_lap_d_refine_cg_sym ..........................***Timeout 205.70 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.705306e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.721895e-03 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.977509e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 8.842447e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.845416e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 9.449124e-03 s Time to initialize coeftab 1.838350e-01 s Time to factorize 2.133357e+00 s ( 4.68 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Memory usage of coeftab 638 Ko Time to solve 1.950201e+00 s - iteration 1 : total iteration time 2.33 s error 5.9067e-14 Time for refinement 3.446538e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.907183e-14 max(|| b_i - A x_i ||_1) 1.569470e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.972174e-01 (SUCCESS) Start 160: c_shm_example_simple_mixed_lap_d_refine_cg_sym 292/3626 Test #162: c_shm_example_simple_mixed_lap_d_refine_bicgstab_sym ....................***Timeout 205.71 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - 
Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.810985e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.304376e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.271652e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 6.216172e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.927893e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.351972e-03 s Time to initialize coeftab 3.144192e-01 s Time to factorize 1.894713e+00 s ( 5.27 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Memory usage of coeftab 638 Ko Time to solve 2.031963e+00 s - iteration 1 : total iteration time 4.33 s error 2.7626e-20 Time for refinement 6.140925e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.198919e-16 max(|| b_i - A x_i ||_1) 6.038518e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.587917e-04 (SUCCESS) Start 162: c_shm_example_simple_mixed_lap_d_refine_bicgstab_sym 292/3626 Test #163: c_shm_example_simple_mixed_lap_z_facto0 .................................***Timeout 205.72 sec ischedInit: The thread number has been 
automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.353258e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.589010e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.822471e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 8.130027e-02 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.377333e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.464695e-03 s Time to initialize coeftab 1.618101e-01 s Time to factorize 5.726152e+00 s ( 3.54 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Memory usage of coeftab 638 Ko Time to solve 1.546876e+00 s - iteration 1 : total 
iteration time 1.36 s error 7.2655e-14 Time for refinement 2.968654e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.264829e-14 max(|| b_i - A x_i ||_1) 2.066471e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.214408e-01 (SUCCESS) max(|| x_i ||_oo) 6.822263e-01 max(|| x0_i - x_i ||_oo) 2.960508e-13 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 4.339481e-01 (SUCCESS) Start 163: c_shm_example_simple_mixed_lap_z_facto0 292/3626 Test #165: c_shm_example_simple_mixed_lap_z_facto2 .................................***Timeout 205.73 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.827672e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.466916e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.464358e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 
130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 1.845383e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.863431e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.229775e-02 s Time to initialize coeftab 1.509265e-01 s Time to factorize 4.790027e+00 s ( 8.34 MFlop/s) Number of operations 8.15 MFlops Number of static pivots 0 Memory usage of coeftab 1.25 Mo Time to solve 1.707323e+00 s - iteration 1 : total iteration time 1.36 s error 6.0407e-14 Time for refinement 2.868620e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.040519e-14 max(|| b_i - A x_i ||_1) 1.730654e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.367026e-01 (SUCCESS) max(|| x_i ||_oo) 6.822263e-01 max(|| x0_i - x_i ||_oo) 3.318439e-13 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 4.864133e-01 (SUCCESS) Start 165: c_shm_example_simple_mixed_lap_z_facto2 292/3626 Test #166: c_shm_example_simple_mixed_lap_z_facto3 .................................***Timeout 205.74 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: 
Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.654665e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.916465e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.974703e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 9.474594e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.827389e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 6.376961e-03 s Time to initialize coeftab 2.992107e-01 s Time to factorize 4.117842e+00 s ( 4.93 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Memory usage of coeftab 638 Ko Time to solve 1.879046e+00 s - iteration 1 : total iteration time 2.07 s error 7.9417e-14 Time for refinement 3.308207e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.941150e-14 max(|| b_i - A x_i ||_1) 2.072529e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.229693e-01 (SUCCESS) max(|| x_i ||_oo) 6.822263e-01 max(|| x0_i - x_i ||_oo) 4.665319e-13 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 6.838374e-01 (SUCCESS) Start 166: c_shm_example_simple_mixed_lap_z_facto3 292/3626 Test #169: c_shm_example_simple_mixed_lap_z_refine_gmres_her .......................***Timeout 205.77 sec ischedInit: The thread number has been 
automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Complex64 Format: CSC N: 1000 nnz: 11476 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.686951e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 84938 Fill-in of L 7.401359 Time to compute symbol matrix 6.642302e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.957182e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 169876 Fill-in 14.802719 Number of operations in full-rank: LU 62.91 MFlops Prediction: Model AMD 6180 MKL Time to factorize 2.138097e-03 s Time for mapping/scheduling 6.343701e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.771685e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.062885e-02 s Time to initialize coeftab 3.237910e-01 s Time to factorize 6.940195e+00 s ( 9.07 MFlop/s) Number of operations 16.83 MFlops Number of static pivots 0 Memory usage of coeftab 1.61 Mo Time to solve 1.773036e+00 s - iteration 1 : total 
iteration time 1.97 s error 1.2492e-13 Time for refinement 3.081602e+00 s || A ||_1 5.530574e-02 max(|| b_i ||_oo) 2.785213e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.249107e-13 max(|| b_i - A x_i ||_1) 3.582198e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.778308e-01 (SUCCESS) Start 169: c_shm_example_simple_mixed_lap_z_refine_gmres_her 292/3626 Test #171: c_shm_example_simple_mixed_lap_z_refine_cg_sym ..........................***Timeout 205.79 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.727370e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.459802e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.731021e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 
1.465112e-03 s Time for mapping/scheduling 4.846603e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.837244e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.611881e-03 s Time to initialize coeftab 2.928827e-01 s Time to factorize 5.001646e+00 s ( 7.99 MFlop/s) Number of operations 8.15 MFlops Number of static pivots 0 Memory usage of coeftab 1.25 Mo Time to solve 1.518164e+00 s - iteration 1 : total iteration time 0.985 s error 7.0377e-14 Time for refinement 2.319505e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.038440e-14 max(|| b_i - A x_i ||_1) 1.896237e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.784849e-01 (SUCCESS) Start 171: c_shm_example_simple_mixed_lap_z_refine_cg_sym 292/3626 Test #172: c_shm_example_simple_mixed_lap_z_refine_gmres_sym .......................***Timeout 205.80 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.793205e+01 s +-------------------------------------------------+ Symbolic factorization 
subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.385545e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.458135e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 4.417645e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.885469e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 8.203221e-02 s Time to initialize coeftab 5.931413e-02 s Time to factorize 4.550337e+00 s ( 8.78 MFlop/s) Number of operations 8.15 MFlops Number of static pivots 0 Memory usage of coeftab 1.25 Mo Time to solve 1.898562e+00 s - iteration 1 : total iteration time 1.19 s error 6.2569e-14 Time for refinement 2.793753e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.257563e-14 max(|| b_i - A x_i ||_1) 1.727246e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.358429e-01 (SUCCESS) Start 172: c_shm_example_simple_mixed_lap_z_refine_gmres_sym 292/3626 Test #178: shm_example_simple_lap_d_facto1_sched0_1d ...............................***Timeout 205.79 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 
MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.527729e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.980775e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.208689e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.831318e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.639478e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.928326e-03 s Time to initialize coeftab 8.211606e-01 s Time to factorize 1.898706e+00 s ( 2.76 MFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Memory usage of coeftab 638 Ko Time to solve 2.348728e-02 s Time for refinement 1.773356e-02 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.583590e-16 max(|| b_i - A x_i ||_1) 1.860623e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.338032e-03 (SUCCESS) Start 178: shm_example_simple_lap_d_facto1_sched0_1d 292/3626 Test #197: shm_example_simple_lap_c_facto1_sched1_1d 
...............................***Timeout 205.73 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.737755e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.877461e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.954902e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.644908e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.786275e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.434995e-03 s Time to initialize coeftab 1.579434e-01 s Time to factorize 5.524848e+00 s ( 3.86 MFlop/s) Number of operations 26.71 MFlops Number of static 
pivots 0 Memory usage of coeftab 638 Ko Time to solve 8.814214e-01 s Time for refinement 7.085730e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.057354e-07 max(|| b_i - A x_i ||_1) 8.794822e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.219197e+00 (SUCCESS) Start 197: shm_example_simple_lap_c_facto1_sched1_1d 292/3626 Test #198: shm_example_simple_lap_c_facto2_sched1_1d ...............................***Timeout 205.74 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.631132e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.636703e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.041798e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 
MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 5.520108e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.738357e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 9.504925e-03 s Time to initialize coeftab 5.173794e-01 s Time to factorize 6.798840e+00 s ( 5.88 MFlop/s) Number of operations 8.15 MFlops Number of static pivots 0 Memory usage of coeftab 1.25 Mo Time to solve 1.046528e+00 s Time for refinement 9.341913e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.026626e-07 max(|| b_i - A x_i ||_1) 8.520973e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.150096e+00 (SUCCESS) Start 198: shm_example_simple_lap_c_facto2_sched1_1d 292/3626 Test #200: shm_example_simple_lap_c_facto4_sched1_1d ...............................***Timeout 205.76 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.691570e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct 
Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.625682e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.990243e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.259907e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.806472e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 9.500635e-03 s Time to initialize coeftab 1.766584e-01 s Time to factorize 5.625069e+00 s ( 3.79 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Memory usage of coeftab 638 Ko Time to solve 1.614686e+00 s Time for refinement 6.411656e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.046461e-07 max(|| b_i - A x_i ||_1) 8.768652e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.212593e+00 (SUCCESS) Start 200: shm_example_simple_lap_c_facto4_sched1_1d 292/3626 Test #201: shm_example_simple_lap_z_facto0_sched1_1d ...............................***Timeout 205.77 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 
320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.522900e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.731666e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.549718e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 3.031003e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.560301e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.490499e-02 s Time to initialize coeftab 1.208715e-01 s Time to factorize 4.730694e+00 s ( 4.29 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Memory usage of coeftab 1.25 Mo Time to solve 7.961472e-01 s Time for refinement 5.608985e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.996546e-16 max(|| b_i - A x_i ||_1) 2.009524e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.070711e-03 (SUCCESS) Start 201: shm_example_simple_lap_z_facto0_sched1_1d 292/3626 Test #203: shm_example_simple_lap_z_facto2_sched1_1d ...............................***Timeout 205.76 sec ischedInit: The thread number has been automatically set 
to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.438624e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.936697e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.685186e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 3.067015e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.519437e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.806128e-01 s Time to initialize coeftab 8.826931e-02 s Time to factorize 5.892888e+00 s ( 6.78 MFlop/s) Number of operations 8.15 MFlops Number of static pivots 0 Memory usage of coeftab 2.49 Mo Time to solve 6.437244e-01 s Time for refinement 1.243771e+00 s || A ||_1 
5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.736340e-16 max(|| b_i - A x_i ||_1) 1.786963e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.509114e-03 (SUCCESS) Start 203: shm_example_simple_lap_z_facto2_sched1_1d 292/3626 Test #204: shm_example_simple_lap_z_facto3_sched1_1d ...............................***Timeout 205.77 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.869014e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.698005e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.098210e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 3.091298e-01 s 
+-------------------------------------------------+ Analyze task: Total time for analyze 1.923810e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 3.061047e-01 s Time to initialize coeftab 1.529066e-01 s Time to factorize 4.630401e+00 s ( 4.38 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Memory usage of coeftab 1.25 Mo Time to solve 6.466091e-01 s Time for refinement 9.114609e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.988809e-16 max(|| b_i - A x_i ||_1) 2.015169e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.084954e-03 (SUCCESS) Start 204: shm_example_simple_lap_z_facto3_sched1_1d 292/3626 Test #205: shm_example_simple_lap_z_facto4_sched1_1d ...............................***Timeout 205.77 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.828502e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to 
compute symbol matrix 8.801555e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.447096e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.469950e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.950718e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 1.354485e-02 s Time to initialize coeftab 2.041001e-01 s Time to factorize 4.945181e+00 s ( 4.31 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Memory usage of coeftab 1.25 Mo Time to solve 1.013660e+00 s Time for refinement 4.502575e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.808582e-16 max(|| b_i - A x_i ||_1) 1.858889e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.690607e-03 (SUCCESS) Start 205: shm_example_simple_lap_z_facto4_sched1_1d 292/3626 Test #209: shm_example_simple_lap_d_facto0_sched4_1d ...............................***Timeout 205.80 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 
GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.501891e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.151850e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.086003e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 1.324542e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.724629e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 5.450213e-01 s Time to initialize coeftab 8.174558e-02 s Time to factorize 1.710298e+00 s ( 2.96 MFlop/s) Number of operations 6.47 MFlops Number of static pivots 0 Memory usage of coeftab 638 Ko Time to solve 9.705889e-01 s Time for refinement 1.196120e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.684849e-16 max(|| b_i - A x_i ||_1) 1.936456e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.433323e-03 (SUCCESS) Start 209: shm_example_simple_lap_d_facto0_sched4_1d 292/3626 Test #214: shm_example_simple_lap_c_facto2_sched4_1d ...............................***Timeout 205.79 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel 
Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.387362e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.168654e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.555039e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 2.311372e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.688468e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.763679e-01 s Time to initialize coeftab 1.044122e-01 s Time to factorize 5.216146e+00 s ( 7.66 MFlop/s) Number of operations 8.15 MFlops Number of static pivots 0 Memory usage of coeftab 1.25 Mo Time to solve 2.154012e+00 s Time for refinement 4.744680e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| 
b_i - A x_i ||_2 / || b_i ||_2) 1.992063e-07 max(|| b_i - A x_i ||_1) 8.458418e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.134312e+00 (SUCCESS) Start 214: shm_example_simple_lap_c_facto2_sched4_1d 292/3626 Test #215: shm_example_simple_lap_c_facto3_sched4_1d ...............................***Timeout 205.80 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.694284e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.586280e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.676344e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 5.359699e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.796123e+01 s 
+-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 3.534330e-03 s Time to initialize coeftab 1.289384e-01 s Time to factorize 4.369330e+00 s ( 4.64 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Memory usage of coeftab 638 Ko Time to solve 1.315330e+00 s Time for refinement 9.317376e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.023631e-07 max(|| b_i - A x_i ||_1) 9.047981e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.283076e+00 (SUCCESS) Start 215: shm_example_simple_lap_c_facto3_sched4_1d 292/3626 Test #216: shm_example_simple_lap_c_facto4_sched4_1d ...............................***Timeout 205.80 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.818129e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.982225e-02 s +-------------------------------------------------+ Reordering subtask: 
Split level 0 Stoping criterion -1 Time for reordering 1.682012e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.001290e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.862512e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 1.572924e-02 s Time to initialize coeftab 1.836812e-01 s Time to factorize 3.828991e+00 s ( 5.56 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Memory usage of coeftab 638 Ko Time to solve 2.009701e+00 s Time for refinement 2.066204e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.056409e-07 max(|| b_i - A x_i ||_1) 8.844120e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.231636e+00 (SUCCESS) Start 216: shm_example_simple_lap_c_facto4_sched4_1d 292/3626 Test #218: shm_example_simple_lap_z_facto1_sched4_1d ...............................***Timeout 205.82 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 
Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.769015e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.325060e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.267010e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.918562e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.969516e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.800791e+00 s Time to initialize coeftab 7.565657e-02 s Time to factorize 3.623352e+00 s ( 5.88 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Memory usage of coeftab 1.25 Mo Time to solve 1.571429e+00 s Time for refinement 7.502823e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.784682e-16 max(|| b_i - A x_i ||_1) 1.866392e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.709540e-03 (SUCCESS) Start 218: shm_example_simple_lap_z_facto1_sched4_1d 292/3626 Test #219: shm_example_simple_lap_z_facto2_sched4_1d ...............................***Timeout 205.82 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: 
sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.872343e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.660213e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.330820e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 1.263109e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.946693e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.165514e-02 s Time to initialize coeftab 5.771961e-02 s Time to factorize 3.441079e+00 s (11.62 MFlop/s) Number of operations 8.15 MFlops Number of static pivots 0 Memory usage of coeftab 2.49 Mo Time to solve 1.760706e+00 s Time for refinement 6.796154e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.667897e-16 max(|| b_i - A x_i ||_1) 1.793030e-16 max(|| b_i - A x_i 
||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.524422e-03 (SUCCESS) Start 219: shm_example_simple_lap_z_facto2_sched4_1d 292/3626 Test #252: shm_example_simple_lap_s_facto0_sched0_not_pqrcpbegin ...................***Timeout 205.40 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.803980e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.249926e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.516684e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops 
Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 8.077704e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.003856e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 5.805200e-01 s Time to initialize coeftab 3.051222e-01 s Time to factorize 5.742002e+00 s (902.78 KFlop/s) Number of operations 6.47 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 1.76 Ko Outside 2.11 Ko Low-rank supernodes Diag in diag 124 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 191 Ko / 191 Ko ------------------------------------------------ Total 319 Ko / 319 Ko Time to solve 1.529699e-01 s - iteration 1 : total iteration time 0.138 s error 1.1591e-11 Time for refinement 1.409897e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.735402e-08 max(|| b_i - A x_i ||_1) 2.804297e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.523782e-01 (SUCCESS) Start 252: shm_example_simple_lap_s_facto0_sched0_not_pqrcpbegin 292/3626 Test #258: shm_example_simple_lap_s_facto0_sched0_not_rqrcpbegin ...................***Timeout 205.36 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 
Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.686983e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.038627e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.482286e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 8.672245e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.776206e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 9.236004e-03 s Time to initialize coeftab 5.876736e-01 s Time to factorize 5.612277e+00 s (923.65 KFlop/s) Number of operations 6.47 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 1.76 Ko Outside 2.11 Ko Low-rank supernodes Diag in diag 124 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 191 Ko / 191 Ko ------------------------------------------------ Total 319 Ko / 319 Ko Time to solve 4.285698e-02 s - iteration 1 
: total iteration time 0.0229 s error 3.3785e-11 Time for refinement 8.998179e-02 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.767364e-08 max(|| b_i - A x_i ||_1) 2.783079e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.497119e-01 (SUCCESS) Start 258: shm_example_simple_lap_s_facto0_sched0_not_rqrcpbegin 292/3626 Test #280: shm_example_simple_lap_s_facto1_sched0_kway_svdbegin ....................***Timeout 205.21 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.546644e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.261171e+00 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.291633e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.386229e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.041124e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.490634e-01 s Time to initialize coeftab 3.837976e-01 s Time to factorize 8.713884e-01 s ( 6.01 MFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 1.76 Ko Outside 2.11 Ko Low-rank supernodes Diag in diag 124 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 191 Ko / 191 Ko ------------------------------------------------ Total 319 Ko / 319 Ko Time to solve 5.484477e-03 s Time for refinement 3.112454e-03 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.249276e-07 max(|| b_i - A x_i ||_1) 9.406115e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.181939e+00 (SUCCESS) Start 280: shm_example_simple_lap_s_facto1_sched0_kway_svdbegin 292/3626 Test #282: shm_example_simple_lap_s_facto1_sched0_kwayprojections_svdbegin .........***Timeout 205.21 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI 
processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.791867e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.584342e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.083537e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.477869e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.920358e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.522814e-01 s Time to initialize coeftab 1.637680e+00 s Time to factorize 4.557849e+00 s ( 1.15 MFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Compression: ------------------------------------------------ 
Full-rank supernodes Inside 1.76 Ko Outside 2.11 Ko Low-rank supernodes Diag in diag 124 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 191 Ko / 191 Ko ------------------------------------------------ Total 319 Ko / 319 Ko Time to solve 3.817273e-01 s Time for refinement 1.352860e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.249276e-07 max(|| b_i - A x_i ||_1) 9.406115e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.181939e+00 (SUCCESS) Start 282: shm_example_simple_lap_s_facto1_sched0_kwayprojections_svdbegin 292/3626 Test #286: shm_example_simple_lap_s_facto1_sched0_kway_pqrcpbegin ..................***Timeout 205.20 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.569023e+01 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.188625e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.089206e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.214877e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.734042e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 8.505273e-01 s Time to initialize coeftab 8.073425e-01 s Time to factorize 4.106649e+00 s ( 1.27 MFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 1.76 Ko Outside 2.11 Ko Low-rank supernodes Diag in diag 124 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 191 Ko / 191 Ko ------------------------------------------------ Total 319 Ko / 319 Ko Time to solve 5.833321e-03 s - iteration 1 : total iteration time 0.00393 s error 1.1418e-11 Time for refinement 8.114501e-03 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.710365e-08 max(|| b_i - A x_i ||_1) 2.856391e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.589241e-01 (SUCCESS) Start 286: shm_example_simple_lap_s_facto1_sched0_kway_pqrcpbegin 292/3626 Test #288: shm_example_simple_lap_s_facto1_sched0_kwayprojections_pqrcpbegin .......***Timeout 205.20 sec ischedInit: The thread number has been automatically set to 
256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.765472e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.654235e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.855141e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.992232e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.796480e+01 s +-------------------------------------------------+ 
Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.152464e-02 s Time to initialize coeftab 4.538015e-01 s Time to factorize 3.434879e+00 s ( 1.52 MFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 1.76 Ko Outside 2.11 Ko Low-rank supernodes Diag in diag 124 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 191 Ko / 191 Ko ------------------------------------------------ Total 319 Ko / 319 Ko Time to solve 5.952975e-03 s - iteration 1 : total iteration time 0.00397 s error 1.1418e-11 Time for refinement 1.108953e-02 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.710365e-08 max(|| b_i - A x_i ||_1) 2.856391e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.589241e-01 (SUCCESS) Start 288: shm_example_simple_lap_s_facto1_sched0_kwayprojections_pqrcpbegin 292/3626 Test #291: shm_example_simple_lap_s_facto1_sched0_not_rqrcpend .....................***Timeout 205.20 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy 
Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.614089e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.294857e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.674523e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.762537e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.224581e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.776440e-03 s Time to initialize coeftab 6.614541e-01 s Time to factorize 5.233634e-01 s (10.00 MFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 1.76 Ko Outside 2.11 Ko Low-rank supernodes Diag in diag 124 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 191 Ko / 191 Ko ------------------------------------------------ Total 319 Ko / 319 Ko Time to solve 1.114724e-02 s Time for refinement 5.143964e-03 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.900394e-07 max(|| b_i - A x_i ||_1) 8.189143e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * 
eps)) 1.029019e+00 (SUCCESS) Start 291: shm_example_simple_lap_s_facto1_sched0_not_rqrcpend 292/3626 Test #294: shm_example_simple_lap_s_facto1_sched0_kwayprojections_rqrcpbegin .......***Timeout 205.19 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.562116e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.758954e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.254485e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops 
Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.415852e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.834392e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.168045e-03 s Time to initialize coeftab 6.156398e-01 s Time to factorize 4.027936e-01 s (12.99 MFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 1.76 Ko Outside 2.11 Ko Low-rank supernodes Diag in diag 124 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 191 Ko / 191 Ko ------------------------------------------------ Total 319 Ko / 319 Ko Time to solve 5.805980e-03 s - iteration 1 : total iteration time 0.0065 s error 3.3713e-11 Time for refinement 1.070224e-02 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.805603e-08 max(|| b_i - A x_i ||_1) 2.876591e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.614624e-01 (SUCCESS) Start 294: shm_example_simple_lap_s_facto1_sched0_kwayprojections_rqrcpbegin 292/3626 Test #300: shm_example_simple_lap_s_facto1_sched0_kwayprojections_tqrcpbegin .......***Timeout 205.18 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L 
- CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.808990e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.918670e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.653255e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.420092e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.948103e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.231799e-02 s Time to initialize coeftab 1.511482e+00 s Time to factorize 6.832341e+00 s (784.37 KFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 1.76 Ko Outside 2.11 Ko Low-rank supernodes Diag in diag 124 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 191 Ko / 191 Ko ------------------------------------------------ Total 319 Ko / 319 Ko Time to solve 
5.352483e-02 s - iteration 1 : total iteration time 0.0192 s error 3.3727e-11 Time for refinement 5.367082e-02 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.806737e-08 max(|| b_i - A x_i ||_1) 2.878308e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.616781e-01 (SUCCESS) Start 300: shm_example_simple_lap_s_facto1_sched0_kwayprojections_tqrcpbegin 292/3626 Test #309: shm_example_simple_lap_s_facto1_sched0_kway_pqrcpilu1 ...................***Timeout 205.14 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.427637e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.157584e-02 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.389687e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.182341e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.795638e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 5.591784e-01 s Time to initialize coeftab 4.189282e-01 s Time to factorize 9.589324e-01 s ( 5.46 MFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 1.76 Ko Outside 2.11 Ko Low-rank supernodes Diag in diag 124 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 191 Ko / 191 Ko ------------------------------------------------ Total 319 Ko / 319 Ko Time to solve 2.771189e-02 s Time for refinement 1.237993e-02 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.892954e-07 max(|| b_i - A x_i ||_1) 8.165908e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.026099e+00 (SUCCESS) Start 309: shm_example_simple_lap_s_facto1_sched0_kway_pqrcpilu1 292/3626 Test #326: shm_example_simple_lap_s_facto2_sched0_kwayprojections_rqrcpbegin .......***Timeout 205.10 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI 
processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.816974e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.848806e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.656933e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 1.032925e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.846949e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.891139e-02 s Time to initialize coeftab 1.294746e+00 s Time to factorize 6.395713e+00 s ( 1.56 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ 
Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 124 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 514 Ko / 514 Ko Time to solve 5.389573e-03 s - iteration 1 : total iteration time 0.00401 s error 3.3467e-11 Time for refinement 8.172594e-03 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.874981e-08 max(|| b_i - A x_i ||_1) 2.911590e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.658601e-01 (SUCCESS) Start 326: shm_example_simple_lap_s_facto2_sched0_kwayprojections_rqrcpbegin 292/3626 Test #330: shm_example_simple_lap_s_facto2_sched0_kway_tqrcpbegin ..................***Timeout 205.08 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute 
ordering 1.555325e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.750490e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.242499e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 4.054381e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.608875e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.794286e-01 s Time to initialize coeftab 1.495681e+00 s Time to factorize 7.295529e+00 s ( 1.37 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 124 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 514 Ko / 514 Ko Time to solve 1.154454e-02 s - iteration 1 : total iteration time 0.0103 s error 3.3478e-11 Time for refinement 1.757612e-02 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.868993e-08 max(|| b_i - A x_i ||_1) 2.906678e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.652429e-01 (SUCCESS) Start 330: shm_example_simple_lap_s_facto2_sched0_kway_tqrcpbegin 292/3626 Test #332: shm_example_simple_lap_s_facto2_sched0_kwayprojections_tqrcpbegin .......***Timeout 205.08 sec ischedInit: The thread number has been 
automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.847478e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.232354e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.419331e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 6.577211e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.954808e+01 s 
+-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.904879e-01 s Time to initialize coeftab 1.060242e+00 s Time to factorize 4.601664e+00 s ( 2.17 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 124 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 514 Ko / 514 Ko Time to solve 5.235947e-03 s - iteration 1 : total iteration time 0.00403 s error 3.3478e-11 Time for refinement 1.032813e-02 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.868993e-08 max(|| b_i - A x_i ||_1) 2.906678e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.652429e-01 (SUCCESS) Start 332: shm_example_simple_lap_s_facto2_sched0_kwayprojections_tqrcpbegin 292/3626 Test #339: shm_example_simple_lap_s_facto2_sched0_kwayprojections_rqrrtend .........***Timeout 205.06 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute 
Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.870791e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.032062e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.656823e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 2.223387e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.201297e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.527306e-01 s Time to initialize coeftab 2.905315e-01 s Time to factorize 3.253946e+00 s ( 3.07 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 124 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 514 Ko / 514 Ko Time to solve 5.003698e-03 s Time for refinement 3.129074e-03 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.893975e-07 max(|| b_i - A x_i ||_1) 
8.091629e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.016766e+00 (SUCCESS) Start 339: shm_example_simple_lap_s_facto2_sched0_kwayprojections_rqrrtend 292/3626 Test #359: shm_example_simple_lap_d_facto0_sched0_kwayprojections_rqrcpend .........***Timeout 204.91 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.723800e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.799286e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.898661e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 
Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 1.369037e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.976900e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.080342e-03 s Time to initialize coeftab 3.073247e-01 s Time to factorize 1.371969e+00 s ( 3.69 MFlop/s) Number of operations 6.47 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 2.133395e-02 s Time for refinement 1.797324e-02 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.649207e-16 max(|| b_i - A x_i ||_1) 1.929752e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.424899e-03 (SUCCESS) Start 359: shm_example_simple_lap_d_facto0_sched0_kwayprojections_rqrcpend 292/3626 Test #362: shm_example_simple_lap_d_facto0_sched0_kway_tqrcpbegin ..................***Timeout 204.90 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 
GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.705253e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.196191e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.196329e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 3.054649e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.250567e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.916885e-03 s Time to initialize coeftab 1.110623e+00 s Time to factorize 1.047530e+00 s ( 4.83 MFlop/s) Number of operations 6.47 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 5.281438e-03 
s - iteration 1 : total iteration time 0.00425 s error 5.4326e-14 Time for refinement 1.117559e-02 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.432761e-14 max(|| b_i - A x_i ||_1) 5.780977e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.264294e-02 (SUCCESS) Start 362: shm_example_simple_lap_d_facto0_sched0_kway_tqrcpbegin 292/3626 Test #363: shm_example_simple_lap_d_facto0_sched0_kway_tqrcpend ....................***Timeout 204.90 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.839200e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.219402e-01 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.197145e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 3.818325e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.953164e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.251187e-02 s Time to initialize coeftab 2.005693e-01 s Time to factorize 6.616210e+00 s (783.50 KFlop/s) Number of operations 6.47 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 1.467014e-01 s Time for refinement 2.812041e-03 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.649207e-16 max(|| b_i - A x_i ||_1) 1.929752e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.424899e-03 (SUCCESS) Start 363: shm_example_simple_lap_d_facto0_sched0_kway_tqrcpend 292/3626 Test #364: shm_example_simple_lap_d_facto0_sched0_kwayprojections_tqrcpbegin .......***Timeout 204.90 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI 
processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.865246e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.439703e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.823724e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 1.408400e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.086607e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.773238e-01 s Time to initialize coeftab 1.235181e+00 s Time to factorize 6.165629e+00 s (840.76 KFlop/s) Number of operations 6.47 MFlops Number of static pivots 0 Compression: ------------------------------------------------ 
Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 5.182324e-03 s - iteration 1 : total iteration time 0.0037 s error 5.4326e-14 Time for refinement 7.838809e-03 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.432761e-14 max(|| b_i - A x_i ||_1) 5.780977e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.264294e-02 (SUCCESS) Start 364: shm_example_simple_lap_d_facto0_sched0_kwayprojections_tqrcpbegin 292/3626 Test #372: shm_example_simple_lap_d_facto0_sched0_kway_pqrcpilu0 ...................***Timeout 204.85 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute 
ordering 1.940231e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.229899e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.851247e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 1.942138e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.302415e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.674045e-01 s Time to initialize coeftab 1.022216e-01 s Time to factorize 3.264675e+00 s ( 1.55 MFlop/s) Number of operations 6.47 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 1.285855e-02 s - iteration 1 : total iteration time 0.00703 s error 5.9009e-15 Time for refinement 1.549243e-02 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.896858e-15 max(|| b_i - A x_i ||_1) 5.917185e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.435452e-03 (SUCCESS) Start 372: shm_example_simple_lap_d_facto0_sched0_kway_pqrcpilu0 292/3626 Test #375: shm_example_simple_lap_d_facto1_sched0_not_svdend .......................***Timeout 204.87 sec ischedInit: The thread number has been 
automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.787372e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.584125e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.262372e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 4.388002e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.858345e+01 s +-------------------------------------------------+ 
Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.679853e-01 s Time to initialize coeftab 2.647398e-01 s Time to factorize 6.127044e+00 s (874.66 KFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 7.778977e-03 s Time for refinement 2.791930e-03 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.583590e-16 max(|| b_i - A x_i ||_1) 1.860623e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.338032e-03 (SUCCESS) Start 375: shm_example_simple_lap_d_facto1_sched0_not_svdend 292/3626 Test #376: shm_example_simple_lap_d_facto1_sched0_kway_svdbegin ....................***Timeout 204.87 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 
Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.545220e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.080323e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.746904e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.982182e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.024887e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.685485e-03 s Time to initialize coeftab 6.272361e-01 s Time to factorize 1.294063e+00 s ( 4.04 MFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 5.714655e-03 s - iteration 1 : total iteration time 0.00372 s error 1.5585e-14 Time for refinement 8.018747e-03 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.558197e-14 max(|| b_i - A x_i ||_1) 2.715471e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.412222e-02 
(SUCCESS) Start 376: shm_example_simple_lap_d_facto1_sched0_kway_svdbegin 292/3626 Test #378: shm_example_simple_lap_d_facto1_sched0_kwayprojections_svdbegin .........***Timeout 204.87 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.596334e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.499577e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.699125e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 
MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 4.006991e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.638220e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.316811e-03 s Time to initialize coeftab 9.018970e-01 s Time to factorize 7.306522e+00 s (733.46 KFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 4.405916e-02 s - iteration 1 : total iteration time 0.0231 s error 1.5585e-14 Time for refinement 6.380900e-02 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.558197e-14 max(|| b_i - A x_i ||_1) 2.715471e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.412222e-02 (SUCCESS) Start 378: shm_example_simple_lap_d_facto1_sched0_kwayprojections_svdbegin 292/3626 Test #382: shm_example_simple_lap_d_facto1_sched0_kway_pqrcpbegin ..................***Timeout 204.86 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank 
parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.831348e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.993405e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.036414e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.948445e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.909449e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.725199e-02 s Time to initialize coeftab 3.130382e-01 s Time to factorize 6.582405e+00 s (814.15 KFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 5.987247e-02 s - iteration 1 : total 
iteration time 0.0288 s error 1.4439e-14 Time for refinement 8.113590e-02 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.443611e-14 max(|| b_i - A x_i ||_1) 2.423022e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.044735e-02 (SUCCESS) Start 382: shm_example_simple_lap_d_facto1_sched0_kway_pqrcpbegin 292/3626 Test #388: shm_example_simple_lap_d_facto1_sched0_kway_rqrcpbegin ..................***Timeout 204.84 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.722014e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.781326e-03 s +-------------------------------------------------+ 
Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.737195e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.287204e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.748471e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.248688e-02 s Time to initialize coeftab 1.305871e+00 s Time to factorize 5.030729e+00 s ( 1.04 MFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 1.188637e-02 s - iteration 1 : total iteration time 0.00698 s error 5.4302e-14 Time for refinement 1.726593e-02 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.430448e-14 max(|| b_i - A x_i ||_1) 5.955393e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.483463e-02 (SUCCESS) Start 388: shm_example_simple_lap_d_facto1_sched0_kway_rqrcpbegin 292/3626 Test #393: shm_example_simple_lap_d_facto1_sched0_not_tqrcpend .....................***Timeout 204.81 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled 
Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.904362e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.421261e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.508371e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 9.800767e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.052648e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.637322e-02 s Time to initialize coeftab 1.052446e-01 s Time to factorize 4.629501e-01 s (11.30 MFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Compression: ------------------------------------------------ 
Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 2.622126e-01 s Time for refinement 2.434186e-03 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.583590e-16 max(|| b_i - A x_i ||_1) 1.860623e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.338032e-03 (SUCCESS) Start 393: shm_example_simple_lap_d_facto1_sched0_not_tqrcpend 292/3626 Test #409: shm_example_simple_lap_d_facto2_sched0_kway_svdend ......................***Timeout 204.71 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.582944e+01 s +-------------------------------------------------+ 
Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.448673e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.869605e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 7.681656e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.735842e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.805060e-01 s Time to initialize coeftab 1.095327e-01 s Time to factorize 3.754637e+00 s ( 2.66 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 6.947261e-02 s Time for refinement 3.642104e-02 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.643681e-16 max(|| b_i - A x_i ||_1) 1.795106e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.255705e-03 (SUCCESS) Start 409: shm_example_simple_lap_d_facto2_sched0_kway_svdend 292/3626 Test #413: shm_example_simple_lap_d_facto2_sched0_not_pqrcpend .....................***Timeout 204.68 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.833680e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.350720e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.781678e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 6.786684e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.925424e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.186332e-01 s Time to initialize coeftab 
4.152601e-01 s Time to factorize 6.480198e+00 s ( 1.54 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 1.124479e-02 s Time for refinement 6.076490e-03 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.643681e-16 max(|| b_i - A x_i ||_1) 1.795106e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.255705e-03 (SUCCESS) Start 413: shm_example_simple_lap_d_facto2_sched0_not_pqrcpend 292/3626 Test #416: shm_example_simple_lap_d_facto2_sched0_kwayprojections_pqrcpbegin .......***Timeout 204.68 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 
3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.655987e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.218776e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.811228e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 8.825503e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.746725e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.099411e-03 s Time to initialize coeftab 3.334387e-01 s Time to factorize 8.583791e+00 s ( 1.16 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 7.126171e-03 s - iteration 1 : total iteration time 0.00403 s error 1.4439e-14 Time for refinement 8.200744e-03 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.443744e-14 max(|| b_i - A x_i ||_1) 2.425004e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.047226e-02 (SUCCESS) Start 416: shm_example_simple_lap_d_facto2_sched0_kwayprojections_pqrcpbegin 292/3626 Test #419: 
shm_example_simple_lap_d_facto2_sched0_not_rqrcpend .....................***Timeout 204.68 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.772311e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.826345e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.181799e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 1.824088e+00 s 
+-------------------------------------------------+ Analyze task: Total time for analyze 2.055402e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.631446e-02 s Time to initialize coeftab 4.118782e-02 s Time to factorize 1.795296e+00 s ( 5.56 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 7.742818e-02 s Time for refinement 3.922620e-02 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.643681e-16 max(|| b_i - A x_i ||_1) 1.795106e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.255705e-03 (SUCCESS) Start 419: shm_example_simple_lap_d_facto2_sched0_not_rqrcpend 292/3626 Test #420: shm_example_simple_lap_d_facto2_sched0_kway_rqrcpbegin ..................***Timeout 204.68 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance 
criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.792113e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.353862e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.741722e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 6.769224e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.876099e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.765776e-02 s Time to initialize coeftab 1.449835e+00 s Time to factorize 5.456054e+00 s ( 1.83 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 5.285719e-03 s - iteration 1 : total iteration time 0.00385 s error 5.5242e-14 Time for refinement 9.304537e-03 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A 
x_i ||_2 / || b_i ||_2) 5.524210e-14 max(|| b_i - A x_i ||_1) 6.118383e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.688275e-02 (SUCCESS) Start 420: shm_example_simple_lap_d_facto2_sched0_kway_rqrcpbegin 292/3626 Test #422: shm_example_simple_lap_d_facto2_sched0_kwayprojections_rqrcpbegin .......***Timeout 204.68 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.970025e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.715065e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.483995e-03 s +-------------------------------------------------+ 
Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 1.454666e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.987001e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.369562e-03 s Time to initialize coeftab 1.836396e+00 s Time to factorize 7.406434e+00 s ( 1.35 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 5.255098e-03 s - iteration 1 : total iteration time 0.00384 s error 5.5242e-14 Time for refinement 7.877281e-03 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.524210e-14 max(|| b_i - A x_i ||_1) 6.118383e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.688275e-02 (SUCCESS) Start 422: shm_example_simple_lap_d_facto2_sched0_kwayprojections_rqrcpbegin 292/3626 Test #426: shm_example_simple_lap_d_facto2_sched0_kway_tqrcpbegin ..................***Timeout 204.66 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution 
level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.787974e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.232722e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.107138e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 9.734022e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.972781e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 9.419472e-03 s Time to initialize coeftab 1.633052e+00 s Time to factorize 2.666863e+00 s ( 3.74 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o 
Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 5.203085e-03 s - iteration 1 : total iteration time 0.00362 s error 5.5242e-14 Time for refinement 9.128791e-03 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.524207e-14 max(|| b_i - A x_i ||_1) 6.118706e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.688680e-02 (SUCCESS) Start 426: shm_example_simple_lap_d_facto2_sched0_kway_tqrcpbegin 292/3626 Test #428: shm_example_simple_lap_d_facto2_sched0_kwayprojections_tqrcpbegin .......***Timeout 204.65 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.464376e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct 
Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.098890e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.409604e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 1.223184e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.652933e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.341563e-01 s Time to initialize coeftab 1.506868e+00 s Time to factorize 1.230282e+00 s ( 8.12 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 6.591851e-03 s - iteration 1 : total iteration time 0.00391 s error 5.5242e-14 Time for refinement 8.097890e-03 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.524207e-14 max(|| b_i - A x_i ||_1) 6.118706e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.688680e-02 (SUCCESS) Start 428: shm_example_simple_lap_d_facto2_sched0_kwayprojections_tqrcpbegin 292/3626 Test #436: shm_example_simple_lap_d_facto2_sched0_kway_pqrcpilu0 ...................***Timeout 204.62 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.742960e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.295055e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.171518e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 4.409625e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.790484e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.338352e-03 s Time to initialize coeftab 
1.667289e-01 s Time to factorize 7.258627e+00 s ( 1.38 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 4.060854e-02 s - iteration 1 : total iteration time 0.0221 s error 5.9009e-15 Time for refinement 8.514178e-02 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.904661e-15 max(|| b_i - A x_i ||_1) 5.905572e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.420859e-03 (SUCCESS) Start 436: shm_example_simple_lap_d_facto2_sched0_kway_pqrcpilu0 292/3626 Test #437: shm_example_simple_lap_d_facto2_sched0_kway_pqrcpilu1 ...................***Timeout 204.62 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: 
Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.389651e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.814645e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.516526e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 8.469559e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.561255e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.338612e-01 s Time to initialize coeftab 5.294840e-01 s Time to factorize 2.367456e+00 s ( 4.22 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 1.126182e-02 s - iteration 1 : total iteration time 0.0103 s error 2.3229e-15 Time for refinement 1.771900e-02 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.329973e-15 max(|| b_i - A x_i ||_1) 1.354561e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.702123e-03 (SUCCESS) Start 437: 
shm_example_simple_lap_d_facto2_sched0_kway_pqrcpilu1 292/3626 Test #445: shm_example_simple_lap_c_facto0_sched0_not_pqrcpend .....................***Timeout 204.63 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.881841e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.806257e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.202234e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 
5.462050e-04 s Time for mapping/scheduling 3.393595e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.969352e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.577820e-03 s Time to initialize coeftab 9.907111e-02 s Time to factorize 3.112310e+00 s ( 6.52 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 9.156459e-02 s Time for refinement 2.410359e-02 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.087415e-07 max(|| b_i - A x_i ||_1) 9.200869e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.321654e+00 (SUCCESS) Start 445: shm_example_simple_lap_c_facto0_sched0_not_pqrcpend 292/3626 Test #450: shm_example_simple_lap_c_facto0_sched0_not_rqrcpbegin ...................***Timeout 204.61 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 
Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.425682e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.049279e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.504166e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 1.134828e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.655950e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.731313e-01 s Time to initialize coeftab 1.666258e+00 s Time to factorize 2.706922e+00 s ( 7.49 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 2.990959e-02 s - iteration 1 : total iteration time 0.0367 s error 5.2041e-11 Time for refinement 4.963943e-02 s || A ||_1 
5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822262e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.507199e-08 max(|| b_i - A x_i ||_1) 3.249837e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.200309e-01 (SUCCESS) Start 450: shm_example_simple_lap_c_facto0_sched0_not_rqrcpbegin 292/3626 Test #453: shm_example_simple_lap_c_facto0_sched0_kway_rqrcpend ....................***Timeout 204.63 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.562634e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.058585e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 
6.623381e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 2.076786e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.003007e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.260985e-02 s Time to initialize coeftab 6.183016e-02 s Time to factorize 1.571861e+00 s (12.90 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 1.453828e-02 s Time for refinement 4.333441e-03 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.087415e-07 max(|| b_i - A x_i ||_1) 9.200869e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.321654e+00 (SUCCESS) Start 453: shm_example_simple_lap_c_facto0_sched0_kway_rqrcpend 292/3626 Test #460: shm_example_simple_lap_c_facto0_sched0_kwayprojections_tqrcpbegin .......***Timeout 204.58 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution 
level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.829515e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.318626e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.695045e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 6.190743e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.893991e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.256485e-02 s Time to initialize coeftab 1.921787e+00 s Time to factorize 2.725709e+00 s ( 7.44 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o 
Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 1.652626e-02 s - iteration 1 : total iteration time 0.0152 s error 5.208e-11 Time for refinement 2.176365e-02 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822262e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.500726e-08 max(|| b_i - A x_i ||_1) 3.242960e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.182957e-01 (SUCCESS) Start 460: shm_example_simple_lap_c_facto0_sched0_kwayprojections_tqrcpbegin 292/3626 Test #463: shm_example_simple_lap_c_facto0_sched0_not_rqrrtend .....................***Timeout 204.57 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.598921e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol 
factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.688772e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.185718e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 1.155873e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.718295e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 5.521505e-01 s Time to initialize coeftab 1.545157e-01 s Time to factorize 4.383233e+00 s ( 4.63 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 2.950939e-02 s Time for refinement 1.057218e-02 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.087415e-07 max(|| b_i - A x_i ||_1) 9.200869e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.321654e+00 (SUCCESS) Start 463: shm_example_simple_lap_c_facto0_sched0_not_rqrrtend 292/3626 Test #473: shm_example_simple_lap_c_facto1_sched0_kway_svdend ......................***Timeout 204.55 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ 
Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.904801e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.622739e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.442783e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 8.522818e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.997608e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 5.782644e-01 s Time to initialize coeftab 2.486629e-01 s Time to factorize 7.348687e+00 s ( 2.90 
MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 3.612525e-02 s Time for refinement 1.051151e-02 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.091321e-07 max(|| b_i - A x_i ||_1) 8.898598e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.245383e+00 (SUCCESS) Start 473: shm_example_simple_lap_c_facto1_sched0_kway_svdend 292/3626 Test #474: shm_example_simple_lap_c_facto1_sched0_kwayprojections_svdbegin .........***Timeout 204.55 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 
+-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.806917e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.103140e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.362044e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.596738e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.886930e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.385798e-01 s Time to initialize coeftab 2.746515e+00 s Time to factorize 2.142902e+00 s ( 9.94 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 1.583702e-02 s Time for refinement 4.446385e-03 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.152375e-07 max(|| b_i - A x_i ||_1) 9.384565e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.368006e+00 (SUCCESS) Start 474: shm_example_simple_lap_c_facto1_sched0_kwayprojections_svdbegin 292/3626 Test #476: shm_example_simple_lap_c_facto1_sched0_not_pqrcpbegin 
...................***Timeout 204.55 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.785179e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.411278e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.271850e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.287996e-01 s +-------------------------------------------------+ Analyze task: Total 
time for analyze 1.822385e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 9.416692e-03 s Time to initialize coeftab 1.141814e+00 s Time to factorize 6.700592e+00 s ( 3.18 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 4.740211e-01 s - iteration 1 : total iteration time 0.27 s error 2.0326e-11 Time for refinement 3.524762e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822262e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.308260e-08 max(|| b_i - A x_i ||_1) 3.117375e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.866070e-01 (SUCCESS) Start 476: shm_example_simple_lap_c_facto1_sched0_not_pqrcpbegin 292/3626 Test #478: shm_example_simple_lap_c_facto1_sched0_kway_pqrcpbegin ..................***Timeout 204.54 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance 
criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.663473e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.677124e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.170203e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.338529e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.846495e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 8.140553e-01 s Time to initialize coeftab 1.740111e+00 s Time to factorize 6.180353e+00 s ( 3.45 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 4.218453e-01 s - iteration 1 : total iteration time 0.424 s error 2.0326e-11 Time for refinement 5.380997e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822262e-01 
max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.308260e-08 max(|| b_i - A x_i ||_1) 3.117375e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.866070e-01 (SUCCESS) Start 478: shm_example_simple_lap_c_facto1_sched0_kway_pqrcpbegin 292/3626 Test #479: shm_example_simple_lap_c_facto1_sched0_kway_pqrcpend ....................***Timeout 204.54 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.749203e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.138284e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.499152e-02 s +-------------------------------------------------+ 
Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.887269e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.827550e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 5.125325e-01 s Time to initialize coeftab 1.455774e-01 s Time to factorize 5.210623e+00 s ( 4.09 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 1.780496e-01 s Time for refinement 2.862233e-02 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.091321e-07 max(|| b_i - A x_i ||_1) 8.898598e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.245383e+00 (SUCCESS) Start 479: shm_example_simple_lap_c_facto1_sched0_kway_pqrcpend 292/3626 Test #484: shm_example_simple_lap_c_facto1_sched0_kway_rqrcpbegin ..................***Timeout 204.52 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational 
models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.615576e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.152734e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.778229e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 4.001950e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.673884e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.743760e-02 s Time to initialize coeftab 3.476320e+00 s Time to factorize 1.788702e+00 s (11.91 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko 
------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 1.503223e-02 s - iteration 1 : total iteration time 0.0161 s error 5.1825e-11 Time for refinement 2.416118e-02 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822262e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.458689e-08 max(|| b_i - A x_i ||_1) 3.230252e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.150892e-01 (SUCCESS) Start 484: shm_example_simple_lap_c_facto1_sched0_kway_rqrcpbegin 292/3626 Test #488: shm_example_simple_lap_c_facto1_sched0_not_tqrcpbegin ...................***Timeout 204.49 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.727619e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L 
structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.878432e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.573799e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.383676e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.767145e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.062002e-02 s Time to initialize coeftab 3.395725e+00 s Time to factorize 3.372594e+00 s ( 6.32 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 1.553249e-02 s - iteration 1 : total iteration time 0.0149 s error 5.172e-11 Time for refinement 2.416064e-02 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822262e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.467028e-08 max(|| b_i - A x_i ||_1) 3.233462e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.158991e-01 (SUCCESS) Start 488: shm_example_simple_lap_c_facto1_sched0_not_tqrcpbegin 292/3626 Test #490: shm_example_simple_lap_c_facto1_sched0_kway_tqrcpbegin ..................***Timeout 204.50 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.702714e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.100730e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.735367e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.400448e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.727574e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.094942e-03 s Time to initialize coeftab 
2.493658e+00 s Time to factorize 4.721665e+00 s ( 4.51 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 3.636353e-02 s - iteration 1 : total iteration time 0.0304 s error 5.172e-11 Time for refinement 4.618517e-02 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822262e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.467028e-08 max(|| b_i - A x_i ||_1) 3.233462e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.158991e-01 (SUCCESS) Start 490: shm_example_simple_lap_c_facto1_sched0_kway_tqrcpbegin 292/3626 Test #492: shm_example_simple_lap_c_facto1_sched0_kwayprojections_tqrcpbegin .......***Timeout 204.48 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 
1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.602398e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.765067e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.671083e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 4.631249e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.650764e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.504518e-03 s Time to initialize coeftab 3.671222e+00 s Time to factorize 2.700150e+00 s ( 7.89 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 1.787378e-02 s - iteration 1 : total iteration time 0.0144 s error 5.172e-11 Time for refinement 2.186736e-02 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822262e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.467028e-08 max(|| b_i - A x_i ||_1) 3.233462e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.158991e-01 (SUCCESS) Start 492: 
shm_example_simple_lap_c_facto1_sched0_kwayprojections_tqrcpbegin 292/3626 Test #493: shm_example_simple_lap_c_facto1_sched0_kwayprojections_tqrcpend .........***Timeout 204.48 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.695126e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.728314e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.587119e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL 
Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.767805e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.840922e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.713131e-01 s Time to initialize coeftab 1.184883e-01 s Time to factorize 9.141230e+00 s ( 2.33 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 3.058514e-02 s Time for refinement 7.380541e-03 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.091321e-07 max(|| b_i - A x_i ||_1) 8.898598e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.245383e+00 (SUCCESS) Start 493: shm_example_simple_lap_c_facto1_sched0_kwayprojections_tqrcpend 292/3626 Test #494: shm_example_simple_lap_c_facto1_sched0_not_rqrrtbegin ...................***Timeout 204.49 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT 
Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.709138e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.638178e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.705573e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.631181e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.981912e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 6.494947e-03 s Time to initialize coeftab 2.251058e+00 s Time to factorize 1.807184e+00 s (11.79 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 1.458620e-02 s - iteration 1 : total iteration time 0.0146 s error 5.1961e-11 Time for refinement 
2.123031e-02 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822262e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.866677e-08 max(|| b_i - A x_i ||_1) 3.344820e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.439981e-01 (SUCCESS) Start 494: shm_example_simple_lap_c_facto1_sched0_not_rqrrtbegin 292/3626 Test #502: shm_example_simple_lap_c_facto2_sched0_not_svdbegin .....................***Timeout 204.15 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.558929e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.660413e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 
Time for reordering 5.785228e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 5.061866e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.611440e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.433045e-03 s Time to initialize coeftab 2.516817e+00 s Time to factorize 5.106531e+00 s ( 7.83 MFlop/s) Number of operations 8.15 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 1.546009e-02 s Time for refinement 4.245038e-03 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.157403e-07 max(|| b_i - A x_i ||_1) 9.287170e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.343431e+00 (SUCCESS) Start 502: shm_example_simple_lap_c_facto2_sched0_not_svdbegin 292/3626 Test #508: shm_example_simple_lap_c_facto2_sched0_not_pqrcpbegin ...................***Timeout 204.12 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple 
Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.606648e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.616335e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.657107e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 8.494369e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.769461e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.873478e-01 s Time to initialize coeftab 8.183270e-01 s Time to factorize 4.074423e+00 s ( 9.81 MFlop/s) Number of operations 8.15 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside 
selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 1.483615e-02 s - iteration 1 : total iteration time 0.0151 s error 2.0432e-11 Time for refinement 2.145348e-02 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822262e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.307790e-08 max(|| b_i - A x_i ||_1) 3.124648e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.884421e-01 (SUCCESS) Start 508: shm_example_simple_lap_c_facto2_sched0_not_pqrcpbegin Start 733: shm_example_simple_lap_z_facto4_sched0_not_pqrcpend Start 734: shm_example_simple_lap_z_facto4_sched0_kway_pqrcpbegin Start 735: shm_example_simple_lap_z_facto4_sched0_kway_pqrcpend Start 736: shm_example_simple_lap_z_facto4_sched0_kwayprojections_pqrcpbegin Start 737: shm_example_simple_lap_z_facto4_sched0_kwayprojections_pqrcpend Start 738: shm_example_simple_lap_z_facto4_sched0_not_rqrcpbegin Start 739: shm_example_simple_lap_z_facto4_sched0_not_rqrcpend Start 740: shm_example_simple_lap_z_facto4_sched0_kway_rqrcpbegin Start 741: shm_example_simple_lap_z_facto4_sched0_kway_rqrcpend Start 742: shm_example_simple_lap_z_facto4_sched0_kwayprojections_rqrcpbegin Start 743: shm_example_simple_lap_z_facto4_sched0_kwayprojections_rqrcpend Start 744: shm_example_simple_lap_z_facto4_sched0_not_tqrcpbegin Start 745: shm_example_simple_lap_z_facto4_sched0_not_tqrcpend Start 746: shm_example_simple_lap_z_facto4_sched0_kway_tqrcpbegin Start 747: shm_example_simple_lap_z_facto4_sched0_kway_tqrcpend Start 748: shm_example_simple_lap_z_facto4_sched0_kwayprojections_tqrcpbegin Start 749: shm_example_simple_lap_z_facto4_sched0_kwayprojections_tqrcpend Start 750: shm_example_simple_lap_z_facto4_sched0_not_rqrrtbegin Start 751: shm_example_simple_lap_z_facto4_sched0_not_rqrrtend Start 752: shm_example_simple_lap_z_facto4_sched0_kway_rqrrtbegin Start 753: shm_example_simple_lap_z_facto4_sched0_kway_rqrrtend Start 754: 
shm_example_simple_lap_z_facto4_sched0_kwayprojections_rqrrtbegin Start 755: shm_example_simple_lap_z_facto4_sched0_kwayprojections_rqrrtend Start 756: shm_example_simple_lap_z_facto4_sched0_kway_pqrcpilu0 Start 757: shm_example_simple_lap_z_facto4_sched0_kway_pqrcpilu1 Start 758: shm_example_simple_lap_s_facto0_sched1_not_svdbegin Start 759: shm_example_simple_lap_s_facto0_sched1_not_svdend Start 760: shm_example_simple_lap_s_facto0_sched1_kway_svdbegin Start 761: shm_example_simple_lap_s_facto0_sched1_kway_svdend Start 762: shm_example_simple_lap_s_facto0_sched1_kwayprojections_svdbegin Start 763: shm_example_simple_lap_s_facto0_sched1_kwayprojections_svdend Start 764: shm_example_simple_lap_s_facto0_sched1_not_pqrcpbegin Start 765: shm_example_simple_lap_s_facto0_sched1_not_pqrcpend Start 766: shm_example_simple_lap_s_facto0_sched1_kway_pqrcpbegin Start 767: shm_example_simple_lap_s_facto0_sched1_kway_pqrcpend Start 768: shm_example_simple_lap_s_facto0_sched1_kwayprojections_pqrcpbegin Start 769: shm_example_simple_lap_s_facto0_sched1_kwayprojections_pqrcpend Start 770: shm_example_simple_lap_s_facto0_sched1_not_rqrcpbegin Start 771: shm_example_simple_lap_s_facto0_sched1_not_rqrcpend Start 772: shm_example_simple_lap_s_facto0_sched1_kway_rqrcpbegin Start 773: shm_example_simple_lap_s_facto0_sched1_kway_rqrcpend Start 774: shm_example_simple_lap_s_facto0_sched1_kwayprojections_rqrcpbegin Start 775: shm_example_simple_lap_s_facto0_sched1_kwayprojections_rqrcpend Start 776: shm_example_simple_lap_s_facto0_sched1_not_tqrcpbegin Start 777: shm_example_simple_lap_s_facto0_sched1_not_tqrcpend Start 778: shm_example_simple_lap_s_facto0_sched1_kway_tqrcpbegin Start 779: shm_example_simple_lap_s_facto0_sched1_kway_tqrcpend Start 780: shm_example_simple_lap_s_facto0_sched1_kwayprojections_tqrcpbegin Start 781: shm_example_simple_lap_s_facto0_sched1_kwayprojections_tqrcpend Start 782: shm_example_simple_lap_s_facto0_sched1_not_rqrrtbegin Start 783: 
shm_example_simple_lap_s_facto0_sched1_not_rqrrtend Start 784: shm_example_simple_lap_s_facto0_sched1_kway_rqrrtbegin Start 785: shm_example_simple_lap_s_facto0_sched1_kway_rqrrtend Start 786: shm_example_simple_lap_s_facto0_sched1_kwayprojections_rqrrtbegin Start 787: shm_example_simple_lap_s_facto0_sched1_kwayprojections_rqrrtend Start 788: shm_example_simple_lap_s_facto0_sched1_kway_pqrcpilu0 Start 789: shm_example_simple_lap_s_facto0_sched1_kway_pqrcpilu1 Start 790: shm_example_simple_lap_s_facto1_sched1_not_svdbegin Start 791: shm_example_simple_lap_s_facto1_sched1_not_svdend Start 792: shm_example_simple_lap_s_facto1_sched1_kway_svdbegin Start 793: shm_example_simple_lap_s_facto1_sched1_kway_svdend Start 794: shm_example_simple_lap_s_facto1_sched1_kwayprojections_svdbegin Start 795: shm_example_simple_lap_s_facto1_sched1_kwayprojections_svdend Start 796: shm_example_simple_lap_s_facto1_sched1_not_pqrcpbegin Start 797: shm_example_simple_lap_s_facto1_sched1_not_pqrcpend Start 798: shm_example_simple_lap_s_facto1_sched1_kway_pqrcpbegin Start 799: shm_example_simple_lap_s_facto1_sched1_kway_pqrcpend Start 800: shm_example_simple_lap_s_facto1_sched1_kwayprojections_pqrcpbegin Start 801: shm_example_simple_lap_s_facto1_sched1_kwayprojections_pqrcpend Start 802: shm_example_simple_lap_s_facto1_sched1_not_rqrcpbegin Start 803: shm_example_simple_lap_s_facto1_sched1_not_rqrcpend 292/3626 Test #527: shm_example_simple_lap_c_facto2_sched0_not_rqrrtend ..................... Passed 34.11 sec 293/3626 Test #534: shm_example_simple_lap_c_facto3_sched0_not_svdbegin ..................... 
Passed 31.21 sec 294/3626 Test #17: c_shm_example_simple_lap_s_facto0 .......................................***Timeout 209.10 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 17: c_shm_example_simple_lap_s_facto0 294/3626 Test #20: c_shm_example_simple_lap_d_facto0 .......................................***Timeout 209.09 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.583708e+01 s +-------------------------------------------------+ Symbolic factorization subtask: 
Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.145888e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.557572e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 7.087909e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.991757e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.428584e-03 s Time to initialize coeftab 3.368776e-01 s Time to factorize 8.670402e+00 s (597.87 KFlop/s) Number of operations 6.47 MFlops Number of static pivots 0 Memory usage of coeftab 638 Ko Time to solve 1.715330e+00 s Time for refinement 9.031954e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.648738e-16 max(|| b_i - A x_i ||_1) 1.920766e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.413607e-03 (SUCCESS) max(|| x_i ||_oo) 4.996108e-01 max(|| x0_i - x_i ||_oo) 1.332268e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.666611e-03 (SUCCESS) Start 20: c_shm_example_simple_lap_d_facto0 294/3626 Test #30: c_shm_example_simple_lap_z_facto2 .......................................***Timeout 209.05 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number 
of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 30: c_shm_example_simple_lap_z_facto2 294/3626 Test #34: c_shm_example_simple_solve_and_refine_lap_s_facto1 ......................***Timeout 209.04 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.275841e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.785097e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.955237e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 
MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.919877e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.337538e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 6.244847e-03 s Time to initialize coeftab 2.021246e-01 s Time to factorize 1.691314e+00 s ( 3.09 MFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Memory usage of coeftab 319 Ko Time to solve 7.425374e+00 s Time for refinement 1.164077e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.960578e-07 max(|| b_i - A x_i ||_1) 8.338820e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.047827e+00 (SUCCESS) max(|| x_i ||_oo) 4.996108e-01 max(|| x0_i - x_i ||_oo) 3.576279e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 7.158130e-01 (SUCCESS) Start 34: c_shm_example_simple_solve_and_refine_lap_s_facto1 294/3626 Test #46: c_shm_example_simple_solve_and_refine_lap_z_facto2 ......................***Timeout 208.98 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 46: c_shm_example_simple_solve_and_refine_lap_z_facto2 294/3626 Test #47: 
c_shm_example_simple_solve_and_refine_lap_z_facto3 ......................***Timeout 208.99 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.436324e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.717591e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.478235e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 8.467448e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.583674e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 1.499900e-01 s Time to initialize coeftab 3.205097e-02 s Time to factorize 2.487639e+00 s ( 8.15 MFlop/s) Number of 
operations 25.68 MFlops Number of static pivots 0 Memory usage of coeftab 1.25 Mo Time to solve 1.595571e+00 s Time for refinement 5.083814e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.996428e-16 max(|| b_i - A x_i ||_1) 2.015844e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.086657e-03 (SUCCESS) max(|| x_i ||_oo) 6.822263e-01 max(|| x0_i - x_i ||_oo) 1.539371e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.256394e-03 (SUCCESS) Start 47: c_shm_example_simple_solve_and_refine_lap_z_facto3 294/3626 Test #49: c_shm_example_simple_trans_lap_s_facto0 .................................***Timeout 208.99 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.440690e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.614391e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.437784e-01 s 
+-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 1.088658e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.477668e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 5.137663e-03 s Time to initialize coeftab 1.443212e-01 s Time to factorize 4.170996e+00 s ( 1.21 MFlop/s) Number of operations 6.47 MFlops Number of static pivots 0 Memory usage of coeftab 319 Ko Time to solve 3.507667e+00 s Time for refinement 1.528036e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.064813e-07 max(|| b_i - A x_i ||_1) 9.021165e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.133568e+00 (SUCCESS) max(|| x_i ||_oo) 4.996108e-01 max(|| x0_i - x_i ||_oo) 3.874302e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 7.754641e-01 (SUCCESS) Start 49: c_shm_example_simple_trans_lap_s_facto0 294/3626 Test #54: c_shm_example_simple_trans_lap_d_facto2 .................................***Timeout 208.97 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No 
compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.664454e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.460504e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.006716e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 2.187831e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.729966e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.351233e-03 s Time to initialize coeftab 2.721497e-02 s Time to factorize 2.740154e+00 s ( 3.64 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Memory usage of coeftab 1.25 Mo Time to solve 3.837636e+00 s Time for refinement 1.002586e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.466254e-16 max(|| b_i - A x_i ||_1) 1.829653e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.299116e-03 (SUCCESS) max(|| x_i ||_oo) 4.996108e-01 max(|| x0_i - x_i ||_oo) 9.992007e-16 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 1.999958e-03 (SUCCESS) Start 54: c_shm_example_simple_trans_lap_d_facto2 294/3626 Test #65: c_shm_example_step-by-step_lap_s_facto0 .................................***Timeout 208.93 sec ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.654754e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.618956e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.237048e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 7.398537e-01 s Time to initialize internal csc 9.223714e-03 s Time to initialize coeftab 5.572425e-01 s Time to factorize 1.590298e+00 s ( 3.18 MFlop/s) Number of operations 6.47 MFlops Number of static pivots 0 Memory usage of coeftab 319 Ko Time to solve 2.183707e+00 s Time for refinement 6.008508e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.176381e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.348525e-07 
max(|| b_i - A x_i ||_1) 9.779437e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.238706e+00 (SUCCESS) max(|| x_i ||_oo) 4.996108e-01 max(|| x0_i - x_i ||_oo) 6.854534e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 1.372929e+00 (SUCCESS) WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Start 65: c_shm_example_step-by-step_lap_s_facto0 294/3626 Test #66: c_shm_example_step-by-step_lap_s_facto1 .................................***Timeout 208.93 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.450210e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.771517e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.914278e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in 
full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.377254e-01 s Time to initialize internal csc 3.132974e-03 s Time to initialize coeftab 2.884213e-01 s Time to factorize 4.114244e+00 s ( 1.27 MFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Memory usage of coeftab 319 Ko Time to solve 3.118493e+00 s Time for refinement 5.233372e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.176381e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.236073e-07 max(|| b_i - A x_i ||_1) 9.121127e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.171421e+00 (SUCCESS) max(|| x_i ||_oo) 4.996108e-01 max(|| x0_i - x_i ||_oo) 5.662441e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 1.135959e+00 (SUCCESS) WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Start 66: c_shm_example_step-by-step_lap_s_facto1 294/3626 Test #67: c_shm_example_step-by-step_lap_s_facto2 .................................***Timeout 208.94 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute 
ordering 1.349138e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.418394e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.585675e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 1.267727e-01 s Time to initialize internal csc 3.119503e-03 s Time to initialize coeftab 2.826524e-01 s Time to factorize 2.257966e+00 s ( 4.42 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Memory usage of coeftab 638 Ko Time to solve 4.098814e+00 s Time for refinement 7.364380e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.176381e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.227544e-07 max(|| b_i - A x_i ||_1) 8.967812e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.135903e+00 (SUCCESS) max(|| x_i ||_oo) 4.996108e-01 max(|| x0_i - x_i ||_oo) 6.556511e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 1.313237e+00 (SUCCESS) WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Start 67: c_shm_example_step-by-step_lap_s_facto2 294/3626 Test #68: c_shm_example_step-by-step_lap_d_facto0 .................................***Timeout 208.95 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of 
threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Start 68: c_shm_example_step-by-step_lap_d_facto0 294/3626 Test #69: c_shm_example_step-by-step_lap_d_facto1 .................................***Timeout 208.95 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.406561e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.264097e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.766701e-01 s +-------------------------------------------------+ 
Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.574481e-01 s Time to initialize internal csc 3.263789e-03 s Time to initialize coeftab 1.419940e-01 s Time to factorize 1.696222e+00 s ( 3.09 MFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Memory usage of coeftab 638 Ko Time to solve 3.094608e+00 s Time for refinement 7.244550e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.176416e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.937682e-16 max(|| b_i - A x_i ||_1) 1.897884e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.437482e-03 (SUCCESS) max(|| x_i ||_oo) 4.996108e-01 max(|| x0_i - x_i ||_oo) 1.276756e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.555905e-03 (SUCCESS) WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Time to solve 2.256185e+00 s Time for refinement 4.279777e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.176416e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.982799e-16 max(|| b_i - A x_i ||_1) 1.910992e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.454318e-03 (SUCCESS) max(|| x_i ||_oo) 4.996108e-01 max(|| x0_i - x_i ||_oo) 1.165734e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.333653e-03 (SUCCESS) Start 69: c_shm_example_step-by-step_lap_d_facto1 294/3626 Test #70: c_shm_example_step-by-step_lap_d_facto2 .................................***Timeout 208.96 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of 
threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Start 70: c_shm_example_step-by-step_lap_d_facto2 294/3626 Test #71: c_shm_example_step-by-step_lap_c_facto0 .................................***Timeout 208.96 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.822467e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.714135e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.016742e-01 s 
+-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 4.616647e-01 s Time to initialize internal csc 3.372583e-03 s Time to initialize coeftab 3.745467e-01 s Time to factorize 3.596040e+00 s ( 5.64 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Memory usage of coeftab 638 Ko Time to solve 2.398816e+00 s Time for refinement 6.605250e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.764572e-02 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.289797e-07 max(|| b_i - A x_i ||_1) 9.413819e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.406375e+00 (SUCCESS) max(|| x_i ||_oo) 7.029080e-01 max(|| x0_i - x_i ||_oo) 5.997601e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 8.698131e-01 (SUCCESS) Start 71: c_shm_example_step-by-step_lap_c_facto0 294/3626 Test #72: c_shm_example_step-by-step_lap_c_facto1 .................................***Timeout 208.97 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one 
+-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.593786e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.125171e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.338387e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 4.763715e-01 s Time to initialize internal csc 3.430476e-03 s Time to initialize coeftab 1.301220e-01 s Time to factorize 3.545016e+00 s ( 6.01 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Memory usage of coeftab 638 Ko Time to solve 4.065846e+00 s Time for refinement 5.180327e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.764572e-02 max(|| x_i ||_oo) 7.029081e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.174228e-07 max(|| b_i - A x_i ||_1) 8.996947e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.268951e+00 (SUCCESS) max(|| x_i ||_oo) 7.029081e-01 max(|| x0_i - x_i ||_oo) 6.469576e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 9.382621e-01 (SUCCESS) WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Start 72: c_shm_example_step-by-step_lap_c_facto1 294/3626 Test #73: c_shm_example_step-by-step_lap_c_facto2 .................................***Timeout 208.97 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled 
thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Start 73: c_shm_example_step-by-step_lap_c_facto2 294/3626 Test #74: c_shm_example_step-by-step_lap_c_facto3 .................................***Timeout 208.98 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 74: c_shm_example_step-by-step_lap_c_facto3 294/3626 Test #75: c_shm_example_step-by-step_lap_c_facto4 .................................***Timeout 208.99 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI 
processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.517231e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.624301e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.089678e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.416977e-01 s Time to initialize internal csc 3.414535e-03 s Time to initialize coeftab 2.735991e-01 s Time to factorize 3.164783e+00 s ( 6.73 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Memory usage of coeftab 638 Ko Time to solve 3.765689e+00 s Time for refinement 6.341885e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.764572e-02 max(|| x_i ||_oo) 7.029081e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.151999e-07 max(|| b_i - A x_i ||_1) 8.979774e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.262526e+00 (SUCCESS) max(|| x_i ||_oo) 7.029081e-01 max(|| x0_i - x_i ||_oo) 7.047793e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 1.022119e+00 (SUCCESS) Start 75: 
c_shm_example_step-by-step_lap_c_facto4 294/3626 Test #76: c_shm_example_step-by-step_lap_z_facto0 .................................***Timeout 209.00 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.793499e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.518886e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.522033e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 6.353889e-01 s Time to initialize internal csc 1.332912e-02 s Time to initialize coeftab 5.096013e-01 s Time to factorize 3.750216e+00 s ( 5.41 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Memory 
usage of coeftab 1.25 Mo Time to solve 2.958195e+00 s Time for refinement 6.119965e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.764616e-02 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.044402e-16 max(|| b_i - A x_i ||_1) 2.025628e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.111346e-03 (SUCCESS) max(|| x_i ||_oo) 7.029080e-01 max(|| x0_i - x_i ||_oo) 1.640164e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.404135e-03 (SUCCESS) WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Start 76: c_shm_example_step-by-step_lap_z_facto0 294/3626 Test #77: c_shm_example_step-by-step_lap_z_facto1 .................................***Timeout 209.01 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.950313e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.611301e-03 s +-------------------------------------------------+ Reordering subtask: Split 
level 0 Stoping criterion -1 Time for reordering 2.785532e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 8.371238e-02 s Time to initialize internal csc 2.014824e-02 s Time to initialize coeftab 4.579559e-02 s Time to factorize 3.721520e+00 s ( 5.73 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Memory usage of coeftab 1.25 Mo Time to solve 3.053176e+00 s Time for refinement 5.737707e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.764616e-02 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.783202e-16 max(|| b_i - A x_i ||_1) 1.860390e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.739094e-03 (SUCCESS) max(|| x_i ||_oo) 7.029080e-01 max(|| x0_i - x_i ||_oo) 1.374392e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.014568e-03 (SUCCESS) Start 77: c_shm_example_step-by-step_lap_z_facto1 294/3626 Test #78: c_shm_example_step-by-step_lap_z_facto2 .................................***Timeout 209.01 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 WARNING: Refinement works only with 1 rhs, We 
will iterate on each RHS one by one +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.886236e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.744146e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.681562e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 5.823114e-01 s Time to initialize internal csc 6.474806e-03 s Time to initialize coeftab 4.353738e-02 s Time to factorize 3.633157e+00 s (11.00 MFlop/s) Number of operations 8.15 MFlops Number of static pivots 0 Memory usage of coeftab 2.49 Mo Time to solve 3.675971e+00 s Time for refinement 5.456731e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.764616e-02 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.790810e-16 max(|| b_i - A x_i ||_1) 1.793778e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.617328e-03 (SUCCESS) max(|| x_i ||_oo) 7.029080e-01 max(|| x0_i - x_i ||_oo) 1.475229e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.098751e-03 (SUCCESS) Start 78: c_shm_example_step-by-step_lap_z_facto2 294/3626 Test #79: c_shm_example_step-by-step_lap_z_facto3 .................................***Timeout 209.02 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started 
PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.675290e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.760336e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.538241e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 1.167449e-01 s Time to initialize internal csc 6.963895e-03 s Time to initialize coeftab 1.672228e-01 s Time to factorize 3.765351e+00 s ( 5.39 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Memory usage of coeftab 1.25 Mo Time to solve 2.863446e+00 s Time for refinement 6.000879e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.764616e-02 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.181542e-16 max(|| b_i - A x_i ||_1) 2.027130e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.115137e-03 (SUCCESS) max(|| x_i ||_oo) 7.029080e-01 max(|| x0_i - x_i ||_oo) 1.516176e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 
2.157005e-03 (SUCCESS) Start 79: c_shm_example_step-by-step_lap_z_facto3 294/3626 Test #80: c_shm_example_step-by-step_lap_z_facto4 .................................***Timeout 209.03 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.782002e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.458435e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.097030e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 4.404208e-01 s Time to initialize internal csc 6.435575e-03 s Time to initialize coeftab 1.553849e-01 s Time to factorize 3.547488e+00 s ( 6.01 MFlop/s) Number of operations 26.71 MFlops 
Number of static pivots 0 Memory usage of coeftab 1.25 Mo Time to solve 2.325034e+00 s Time for refinement 5.326411e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.764616e-02 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.759680e-16 max(|| b_i - A x_i ||_1) 1.851785e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.721455e-03 (SUCCESS) max(|| x_i ||_oo) 7.029080e-01 max(|| x0_i - x_i ||_oo) 1.621268e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.376437e-03 (SUCCESS) WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Start 80: c_shm_example_step-by-step_lap_z_facto4 294/3626 Test #89: c_shm_example_personal_lap_s_facto0 .....................................***Timeout 208.98 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 89: c_shm_example_personal_lap_s_facto0 294/3626 Test #90: c_shm_example_personal_lap_s_facto1 .....................................***Timeout 208.99 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled 
Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 90: c_shm_example_personal_lap_s_facto1 294/3626 Test #91: c_shm_example_personal_lap_s_facto2 .....................................***Timeout 209.00 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 91: c_shm_example_personal_lap_s_facto2 294/3626 Test #92: c_shm_example_personal_lap_d_facto0 .....................................***Timeout 209.00 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 
Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Start 92: c_shm_example_personal_lap_d_facto0 294/3626 Test #93: c_shm_example_personal_lap_d_facto1 .....................................***Timeout 209.01 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Start 93: c_shm_example_personal_lap_d_facto1 294/3626 Test #94: c_shm_example_personal_lap_d_facto2 .....................................***Timeout 209.01 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 
1000 nnz: 3700 Start 94: c_shm_example_personal_lap_d_facto2 294/3626 Test #95: c_shm_example_personal_lap_c_facto0 .....................................***Timeout 209.01 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 95: c_shm_example_personal_lap_c_facto0 294/3626 Test #96: c_shm_example_personal_lap_c_facto1 .....................................***Timeout 209.02 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 96: c_shm_example_personal_lap_c_facto1 294/3626 Test #97: c_shm_example_personal_lap_c_facto2 .....................................***Timeout 209.03 sec ischedInit: 
The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 97: c_shm_example_personal_lap_c_facto2 294/3626 Test #98: c_shm_example_personal_lap_c_facto3 .....................................***Timeout 209.04 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 98: c_shm_example_personal_lap_c_facto3 294/3626 Test #99: c_shm_example_personal_lap_c_facto4 .....................................***Timeout 209.04 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 99: c_shm_example_personal_lap_c_facto4 294/3626 Test #100: c_shm_example_personal_lap_z_facto0 .....................................***Timeout 209.05 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 100: c_shm_example_personal_lap_z_facto0 294/3626 Test #101: c_shm_example_personal_lap_z_facto1 .....................................***Timeout 209.06 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number 
of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 101: c_shm_example_personal_lap_z_facto1 294/3626 Test #102: c_shm_example_personal_lap_z_facto2 .....................................***Timeout 209.06 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 102: c_shm_example_personal_lap_z_facto2 294/3626 Test #103: c_shm_example_personal_lap_z_facto3 .....................................***Timeout 209.06 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 
320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 103: c_shm_example_personal_lap_z_facto3 294/3626 Test #104: c_shm_example_personal_lap_z_facto4 .....................................***Timeout 209.09 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 104: c_shm_example_personal_lap_z_facto4 294/3626 Test #121: c_shm_example_simple_scotch_rsa .........................................***Timeout 209.00 sec RSA driver is no longer supported and is replaced by the HB driver ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: 
Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 12111 nnz: 40537 Start 121: c_shm_example_simple_scotch_rsa 294/3626 Test #125: c_shm_example_simple_single_rsa .........................................***Timeout 208.99 sec RSA driver is no longer supported and is replaced by the HB driver ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 12111 nnz: 40537 Start 125: c_shm_example_simple_single_rsa 294/3626 Test #129: c_shm_example_step-by-step_single_rsa ...................................***Timeout 208.98 sec RSA driver is no longer supported and is replaced by the HB driver ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: 
Double Format: CSC N: 12111 nnz: 40537 Start 129: c_shm_example_step-by-step_single_rsa 294/3626 Test #130: c_shm_example_step-by-step_single_mm ....................................***Timeout 208.98 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Complex64 Format: IJV N: 841 nnz: 2465 WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.177903e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 24350 Fill-in of L 9.878296 Time to compute symbol matrix 6.796948e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.227856e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 24350 Fill-in 9.878296 Number of operations in full-rank: LDL^t 3.58 MFlops Prediction: Model AMD 6180 MKL Time to factorize 2.058387e-04 s Time for mapping/scheduling 9.833693e-01 s Time to initialize internal csc 4.491137e-03 s Time to initialize coeftab 3.727305e-01 s Time to factorize 8.247811e-01 s ( 4.35 MFlop/s) Number of operations 
4.26 MFlops Number of static pivots 0 Memory usage of coeftab 518 Ko Time to solve 3.294098e+00 s Time for refinement 6.112490e+00 s || A ||_1 8.589990e-02 max(|| b_i ||_oo) 4.291881e-02 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.286477e-15 max(|| b_i - A x_i ||_1) 1.443945e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.262666e-03 (SUCCESS) WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Time to solve 1.611776e+00 s Time for refinement 4.292784e+00 s || A ||_1 8.589990e-02 max(|| b_i ||_oo) 4.291881e-02 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.591844e-15 max(|| b_i - A x_i ||_1) 1.413373e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.997513e-03 (SUCCESS) Start 130: c_shm_example_step-by-step_single_mm 294/3626 Test #131: c_shm_example_step-by-step_single_hb ....................................***Timeout 208.99 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: General Arithmetic: Double Format: CSC N: 1030 nnz: 6858 WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Start 131: c_shm_example_step-by-step_single_hb 294/3626 Test #132: c_shm_example_step-by-step_single_mm2 ...................................***Timeout 209.00 sec ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: IJV N: 1280 nnz: 12029 WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.836888e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 10749 Fill-in of L 0.893590 Time to compute symbol matrix 7.867870e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.442423e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 21498 Fill-in 1.787181 Number of operations in full-rank: LU 1.08 MFlops Prediction: Model AMD 6180 MKL Time to factorize 8.513996e-04 s Time for mapping/scheduling 4.732968e-01 s Time to initialize internal csc 2.064839e-02 s Time to initialize coeftab 1.958545e-01 s Time to factorize 5.563580e-01 s ( 1.95 MFlop/s) Number of operations 1.20 MFlops Number of static pivots 0 Memory usage of coeftab 510 Ko Time to solve 2.544620e+00 s Time for refinement 1.041076e+01 s || A ||_1 7.256473e-01 max(|| b_i ||_oo) 3.287903e-01 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.081183e-16 
max(|| b_i - A x_i ||_1) 1.158996e-18 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.415511e-04 (SUCCESS) WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Time to solve 1.669910e+00 s Time for refinement 4.733583e+00 s || A ||_1 7.256473e-01 max(|| b_i ||_oo) 3.287903e-01 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.081183e-16 max(|| b_i - A x_i ||_1) 1.158996e-18 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.415511e-04 (SUCCESS) WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Start 132: c_shm_example_step-by-step_single_mm2 294/3626 Test #133: c_shm_example_simple_refine_cg ..........................................***Timeout 209.00 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 12111 nnz: 40537 Start 133: c_shm_example_simple_refine_cg 294/3626 Test #136: c_shm_example_refinement_lap_s_refine_cg_sym ............................***Timeout 209.00 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: 
Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 136: c_shm_example_refinement_lap_s_refine_cg_sym 294/3626 Test #137: c_shm_example_refinement_lap_s_refine_gmres_sym .........................***Timeout 209.00 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 137: c_shm_example_refinement_lap_s_refine_gmres_sym 294/3626 Test #138: c_shm_example_refinement_lap_s_refine_bicgstab_sym ......................***Timeout 209.01 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) 
Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 138: c_shm_example_refinement_lap_s_refine_bicgstab_sym 294/3626 Test #139: c_shm_example_refinement_lap_d_refine_cg_sym ............................***Timeout 209.01 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Start 139: c_shm_example_refinement_lap_d_refine_cg_sym 294/3626 Test #140: c_shm_example_refinement_lap_d_refine_gmres_sym .........................***Timeout 209.02 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No 
compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Start 140: c_shm_example_refinement_lap_d_refine_gmres_sym 294/3626 Test #141: c_shm_example_refinement_lap_d_refine_bicgstab_sym ......................***Timeout 209.02 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Start 141: c_shm_example_refinement_lap_d_refine_bicgstab_sym 294/3626 Test #142: c_shm_example_refinement_lap_c_refine_cg_her ............................***Timeout 209.03 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Complex32 Format: CSC N: 1000 nnz: 11476 Start 142: c_shm_example_refinement_lap_c_refine_cg_her 294/3626 Test #143: 
c_shm_example_refinement_lap_c_refine_gmres_her .........................***Timeout 209.04 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Complex32 Format: CSC N: 1000 nnz: 11476 Start 143: c_shm_example_refinement_lap_c_refine_gmres_her 294/3626 Test #144: c_shm_example_refinement_lap_c_refine_bicgstab_her ......................***Timeout 209.04 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Complex32 Format: CSC N: 1000 nnz: 11476 Start 144: c_shm_example_refinement_lap_c_refine_bicgstab_her 294/3626 Test #145: c_shm_example_refinement_lap_c_refine_cg_sym ............................***Timeout 209.05 sec ischedInit: The thread number has been automatically set to 
256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.435253e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.901863e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.101769e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 6.112671e-01 s Time to initialize internal csc 6.567539e-03 s - iteration 1 : total iteration time 3.1 s error 0.20457 - iteration 2 : total iteration time 1.66 s error 0.058883 - iteration 3 : total iteration time 1.26 s error 0.018805 - iteration 4 : total iteration time 1.1 s error 0.0064705 - iteration 5 : total iteration time 0.991 s error 0.0022688 - iteration 6 : total iteration time 0.556 s error 0.00080218 - iteration 7 : total iteration time 1.17 s error 0.00027994 - iteration 8 : total iteration time 
1.13 s error 9.2911e-05 - iteration 9 : total iteration time 0.751 s error 3.0814e-05 - iteration 10 : total iteration time 0.912 s error 1.0212e-05 - iteration 11 : total iteration time 0.821 s error 3.1309e-06 - iteration 12 : total iteration time 0.814 s error 9.4295e-07 Time for refinement 1.644888e+01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822265e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.526498e-07 max(|| b_i - A x_i ||_1) 5.490745e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.385479e+01 (SUCCESS) Start 145: c_shm_example_refinement_lap_c_refine_cg_sym 294/3626 Test #146: c_shm_example_refinement_lap_c_refine_gmres_sym .........................***Timeout 209.06 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 146: c_shm_example_refinement_lap_c_refine_gmres_sym 294/3626 Test #147: c_shm_example_refinement_lap_c_refine_bicgstab_sym ......................***Timeout 209.06 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled 
Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 147: c_shm_example_refinement_lap_c_refine_bicgstab_sym 294/3626 Test #148: c_shm_example_refinement_lap_z_refine_cg_her ............................***Timeout 209.07 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Complex64 Format: CSC N: 1000 nnz: 11476 Start 148: c_shm_example_refinement_lap_z_refine_cg_her 294/3626 Test #149: c_shm_example_refinement_lap_z_refine_gmres_her .........................***Timeout 209.08 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) 
Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Complex64 Format: CSC N: 1000 nnz: 11476 Start 149: c_shm_example_refinement_lap_z_refine_gmres_her 294/3626 Test #150: c_shm_example_refinement_lap_z_refine_bicgstab_her ......................***Timeout 209.09 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Complex64 Format: CSC N: 1000 nnz: 11476 Start 150: c_shm_example_refinement_lap_z_refine_bicgstab_her 294/3626 Test #151: c_shm_example_refinement_lap_z_refine_cg_sym ............................***Timeout 209.10 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy 
No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 151: c_shm_example_refinement_lap_z_refine_cg_sym 294/3626 Test #152: c_shm_example_refinement_lap_z_refine_gmres_sym .........................***Timeout 209.10 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 152: c_shm_example_refinement_lap_z_refine_gmres_sym 294/3626 Test #153: c_shm_example_refinement_lap_z_refine_bicgstab_sym ......................***Timeout 209.11 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 153: c_shm_example_refinement_lap_z_refine_bicgstab_sym 294/3626 
Test #154: c_shm_example_simple_mixed_refine_cg ....................................***Timeout 209.12 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 12111 nnz: 40537 Start 154: c_shm_example_simple_mixed_refine_cg 294/3626 Test #159: c_shm_example_simple_mixed_lap_d_facto2 .................................***Timeout 209.10 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.279506e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization 
using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.821449e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.918324e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 4.769360e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.357886e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.133108e-02 s Time to initialize coeftab 3.328731e-01 s Time to factorize 2.149078e+00 s ( 4.65 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Memory usage of coeftab 638 Ko Time to solve 7.010003e-01 s - iteration 1 : total iteration time 1.11 s error 5.2164e-14 Time for refinement 4.121973e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.217101e-14 max(|| b_i - A x_i ||_1) 1.587423e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.994734e-01 (SUCCESS) max(|| x_i ||_oo) 4.996108e-01 max(|| x0_i - x_i ||_oo) 1.432188e-13 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.866607e-01 (SUCCESS) Start 159: c_shm_example_simple_mixed_lap_d_facto2 294/3626 Test #161: c_shm_example_simple_mixed_lap_d_refine_gmres_sym .......................***Timeout 209.10 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled 
StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.654495e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.905788e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.181216e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 1.738760e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.692276e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.216261e-01 s Time to initialize coeftab 2.155233e-01 s Time to factorize 1.907473e+00 s ( 5.23 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Memory usage of coeftab 638 Ko Time to solve 1.734526e+00 s - iteration 1 : total iteration time 1.77 s error 7.3788e-14 Time for refinement 3.556145e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.378838e-14 max(|| b_i - A x_i ||_1) 1.765276e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * 
||x_i||_oo * eps)) 2.218221e-01 (SUCCESS) Start 161: c_shm_example_simple_mixed_lap_d_refine_gmres_sym 294/3626 Test #164: c_shm_example_simple_mixed_lap_z_facto1 .................................***Timeout 209.09 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.606315e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.628631e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.501946e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.634965e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.674171e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize 
internal csc 6.519117e-03 s Time to initialize coeftab 3.872594e-02 s Time to factorize 3.443109e+00 s ( 6.19 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Memory usage of coeftab 638 Ko Time to solve 3.285000e+00 s - iteration 1 : total iteration time 2.32 s error 6.6843e-14 Time for refinement 4.264110e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.684303e-14 max(|| b_i - A x_i ||_1) 1.843072e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.650695e-01 (SUCCESS) max(|| x_i ||_oo) 6.822263e-01 max(|| x0_i - x_i ||_oo) 2.693775e-13 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 3.948507e-01 (SUCCESS) Start 164: c_shm_example_simple_mixed_lap_z_facto1 294/3626 Test #167: c_shm_example_simple_mixed_lap_z_facto4 .................................***Timeout 209.09 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 167: c_shm_example_simple_mixed_lap_z_facto4 294/3626 Test #168: c_shm_example_simple_mixed_lap_z_refine_cg_her ..........................***Timeout 209.09 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Complex64 Format: CSC N: 1000 nnz: 11476 Start 168: c_shm_example_simple_mixed_lap_z_refine_cg_her 294/3626 Test #170: c_shm_example_simple_mixed_lap_z_refine_bicgstab_her ....................***Timeout 209.09 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Complex64 Format: CSC N: 1000 nnz: 11476 Start 170: c_shm_example_simple_mixed_lap_z_refine_bicgstab_her 294/3626 Test #173: c_shm_example_simple_mixed_lap_z_refine_bicgstab_sym ....................***Timeout 209.09 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: 
Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 173: c_shm_example_simple_mixed_lap_z_refine_bicgstab_sym 294/3626 Test #191: shm_example_simple_lap_s_facto1_sched1_1d ...............................***Timeout 208.99 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.506757e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.713845e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.316516e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 
Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.828758e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.532824e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 6.013782e-01 s Time to initialize coeftab 1.689942e-01 s Time to factorize 2.015502e+00 s ( 2.60 MFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Memory usage of coeftab 319 Ko Time to solve 1.247960e+00 s Time for refinement 1.490061e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.930618e-07 max(|| b_i - A x_i ||_1) 8.254857e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.037276e+00 (SUCCESS) Start 191: shm_example_simple_lap_s_facto1_sched1_1d 294/3626 Test #192: shm_example_simple_lap_s_facto2_sched1_1d ...............................***Timeout 209.00 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.370490e+01 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.251349e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.012757e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 2.767173e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.724585e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 5.241734e-01 s Time to initialize coeftab 1.012965e-01 s Time to factorize 1.211958e+00 s ( 8.24 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Memory usage of coeftab 638 Ko Time to solve 9.780289e-01 s Time for refinement 6.960996e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.928048e-07 max(|| b_i - A x_i ||_1) 8.220308e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.032935e+00 (SUCCESS) Start 192: shm_example_simple_lap_s_facto2_sched1_1d 294/3626 Test #199: shm_example_simple_lap_c_facto3_sched1_1d ...............................***Timeout 208.97 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of 
GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.510081e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.261549e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.661583e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 2.838230e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.924081e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 6.564222e-01 s Time to initialize coeftab 5.827823e-02 s Time to factorize 2.592416e+00 s ( 7.82 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Memory usage of coeftab 638 Ko Time to solve 8.166069e-01 s Time for refinement 1.230790e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.069478e-07 max(|| b_i - A x_i ||_1) 9.108871e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.298440e+00 (SUCCESS) Start 199: shm_example_simple_lap_c_facto3_sched1_1d 294/3626 Test #206: 
shm_example_simple_lap_s_facto0_sched4_1d ...............................***Timeout 208.95 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 206: shm_example_simple_lap_s_facto0_sched4_1d 294/3626 Test #207: shm_example_simple_lap_s_facto1_sched4_1d ...............................***Timeout 208.95 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.604876e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax 
Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.281501e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.939252e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.315906e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.680130e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.072711e-03 s Time to initialize coeftab 2.102103e-01 s Time to factorize 1.692264e+00 s ( 3.09 MFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Memory usage of coeftab 319 Ko Time to solve 1.973628e+00 s Time for refinement 1.231275e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.971899e-07 max(|| b_i - A x_i ||_1) 8.425242e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.058686e+00 (SUCCESS) Start 207: shm_example_simple_lap_s_facto1_sched4_1d 294/3626 Test #212: shm_example_simple_lap_c_facto0_sched4_1d ...............................***Timeout 208.93 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 
/ 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.558864e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.928684e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.588413e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 1.501499e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.585520e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 9.556557e-03 s Time to initialize coeftab 3.804688e-02 s Time to factorize 4.028973e+00 s ( 5.03 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Memory usage of coeftab 638 Ko Time to solve 2.000617e+00 s Time for refinement 1.079143e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.056074e-07 max(|| b_i - A x_i ||_1) 9.139283e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.306114e+00 (SUCCESS) Start 212: shm_example_simple_lap_c_facto0_sched4_1d 294/3626 Test #217: shm_example_simple_lap_z_facto0_sched4_1d ...............................***Timeout 208.92 sec ischedInit: The thread number has been automatically set 
to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 217: shm_example_simple_lap_z_facto0_sched4_1d 294/3626 Test #221: shm_example_simple_lap_z_facto4_sched4_1d ...............................***Timeout 208.90 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.495685e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.920722e-03 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.360880e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.950917e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.527117e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 3.833604e-02 s Time to initialize coeftab 6.924949e-02 s Time to factorize 3.303751e+00 s ( 6.45 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Memory usage of coeftab 1.25 Mo Time to solve 2.258645e+00 s Time for refinement 5.194359e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.850859e-16 max(|| b_i - A x_i ||_1) 1.872101e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.723945e-03 (SUCCESS) Start 221: shm_example_simple_lap_z_facto4_sched4_1d 294/3626 Test #248: shm_example_simple_lap_s_facto0_sched0_kway_svdbegin ....................***Timeout 208.50 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank 
parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 248: shm_example_simple_lap_s_facto0_sched0_kway_svdbegin 294/3626 Test #278: shm_example_simple_lap_s_facto1_sched0_not_svdbegin .....................***Timeout 208.27 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.567342e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 
17.592432 Time to compute symbol matrix 2.451726e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.903505e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 8.171351e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.703451e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.686138e-01 s Time to initialize coeftab 1.267999e+00 s Time to factorize 2.524509e+00 s ( 2.07 MFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 1.76 Ko Outside 2.11 Ko Low-rank supernodes Diag in diag 124 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 191 Ko / 191 Ko ------------------------------------------------ Total 319 Ko / 319 Ko Time to solve 5.808550e-03 s Time for refinement 3.126003e-03 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.249276e-07 max(|| b_i - A x_i ||_1) 9.406115e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.181939e+00 (SUCCESS) Start 278: shm_example_simple_lap_s_facto1_sched0_not_svdbegin 294/3626 Test #284: shm_example_simple_lap_s_facto1_sched0_not_pqrcpbegin ...................***Timeout 208.23 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: 
Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 284: shm_example_simple_lap_s_facto1_sched0_not_pqrcpbegin 294/3626 Test #290: shm_example_simple_lap_s_facto1_sched0_not_rqrcpbegin ...................***Timeout 208.19 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: 
CSC N: 1000 nnz: 3700 Start 290: shm_example_simple_lap_s_facto1_sched0_not_rqrcpbegin 294/3626 Test #295: shm_example_simple_lap_s_facto1_sched0_kwayprojections_rqrcpend .........***Timeout 208.16 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 295: shm_example_simple_lap_s_facto1_sched0_kwayprojections_rqrcpend 294/3626 Test #298: shm_example_simple_lap_s_facto1_sched0_kway_tqrcpbegin ..................***Timeout 208.15 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 
160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.718979e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.886391e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.787072e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.077272e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.735756e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 6.283379e-03 s Time to initialize coeftab 1.223125e+00 s Time to factorize 4.384591e+00 s ( 1.19 MFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 1.76 Ko Outside 2.11 Ko Low-rank supernodes Diag in diag 124 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 191 Ko / 191 Ko 
------------------------------------------------ Total 319 Ko / 319 Ko Time to solve 3.881045e-02 s - iteration 1 : total iteration time 0.0635 s error 3.3727e-11 Time for refinement 9.413339e-02 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.806737e-08 max(|| b_i - A x_i ||_1) 2.878308e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.616781e-01 (SUCCESS) Start 298: shm_example_simple_lap_s_facto1_sched0_kway_tqrcpbegin 294/3626 Test #308: shm_example_simple_lap_s_facto1_sched0_kway_pqrcpilu0 ...................***Timeout 208.08 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 308: shm_example_simple_lap_s_facto1_sched0_kway_pqrcpilu0 294/3626 Test #310: shm_example_simple_lap_s_facto2_sched0_not_svdbegin .....................***Timeout 208.08 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + 
PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 310: shm_example_simple_lap_s_facto2_sched0_not_svdbegin 294/3626 Test #311: shm_example_simple_lap_s_facto2_sched0_not_svdend .......................***Timeout 208.08 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting 
Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.515673e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.767937e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.643383e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 4.934950e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.567713e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 7.013297e-03 s Time to initialize coeftab 9.514126e-02 s Time to factorize 3.218893e-01 s (31.02 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 124 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 514 Ko / 514 Ko Time to solve 1.143407e-02 s Time for refinement 6.133492e-03 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.893975e-07 max(|| b_i - A x_i ||_1) 8.091629e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * 
eps)) 1.016766e+00 (SUCCESS) Start 311: shm_example_simple_lap_s_facto2_sched0_not_svdend 294/3626 Test #315: shm_example_simple_lap_s_facto2_sched0_kwayprojections_svdend ...........***Timeout 208.06 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 315: shm_example_simple_lap_s_facto2_sched0_kwayprojections_svdend 294/3626 Test #316: shm_example_simple_lap_s_facto2_sched0_not_pqrcpbegin ...................***Timeout 208.06 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 
160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 316: shm_example_simple_lap_s_facto2_sched0_not_pqrcpbegin 294/3626 Test #322: shm_example_simple_lap_s_facto2_sched0_not_rqrcpbegin ...................***Timeout 208.02 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 322: shm_example_simple_lap_s_facto2_sched0_not_rqrcpbegin 294/3626 Test #325: shm_example_simple_lap_s_facto2_sched0_kway_rqrcpend ....................***Timeout 208.01 sec ischedInit: The thread 
number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 325: shm_example_simple_lap_s_facto2_sched0_kway_rqrcpend 294/3626 Test #333: shm_example_simple_lap_s_facto2_sched0_kwayprojections_tqrcpend .........***Timeout 207.96 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 
1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.658014e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.305875e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.672763e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 6.129400e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.736383e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.805595e-01 s Time to initialize coeftab 4.027732e-01 s Time to factorize 3.410005e+00 s ( 2.93 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 124 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 514 Ko / 514 Ko Time to solve 1.136744e-02 s Time for refinement 6.209425e-03 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 
1.893975e-07 max(|| b_i - A x_i ||_1) 8.091629e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.016766e+00 (SUCCESS) Start 333: shm_example_simple_lap_s_facto2_sched0_kwayprojections_tqrcpend 294/3626 Test #335: shm_example_simple_lap_s_facto2_sched0_not_rqrrtend .....................***Timeout 207.95 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.589417e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.997987e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.205860e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of 
non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 3.462116e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.627209e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.243857e-03 s Time to initialize coeftab 1.185646e-01 s Time to factorize 3.258128e-01 s (30.65 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 124 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 514 Ko / 514 Ko Time to solve 1.130885e-02 s Time for refinement 9.343689e-03 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.893975e-07 max(|| b_i - A x_i ||_1) 8.091629e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.016766e+00 (SUCCESS) Start 335: shm_example_simple_lap_s_facto2_sched0_not_rqrrtend 294/3626 Test #373: shm_example_simple_lap_d_facto0_sched0_kway_pqrcpilu1 ...................***Timeout 207.66 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: 
Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Start 373: shm_example_simple_lap_d_facto0_sched0_kway_pqrcpilu1 294/3626 Test #374: shm_example_simple_lap_d_facto1_sched0_not_svdbegin .....................***Timeout 207.66 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.578207e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of 
nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.535781e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.175537e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.308526e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.706108e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.399794e-03 s Time to initialize coeftab 9.414689e-01 s Time to factorize 7.419976e+00 s (722.25 KFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 6.781702e-02 s - iteration 1 : total iteration time 0.0221 s error 1.5585e-14 Time for refinement 6.474429e-02 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.558197e-14 max(|| b_i - A x_i ||_1) 2.715471e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.412222e-02 (SUCCESS) Start 374: shm_example_simple_lap_d_facto1_sched0_not_svdbegin 294/3626 Test #381: shm_example_simple_lap_d_facto1_sched0_not_pqrcpend .....................***Timeout 207.62 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Start 381: shm_example_simple_lap_d_facto1_sched0_not_pqrcpend 294/3626 Test #387: shm_example_simple_lap_d_facto1_sched0_not_rqrcpend .....................***Timeout 207.58 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 
Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Start 387: shm_example_simple_lap_d_facto1_sched0_not_rqrcpend 294/3626 Test #395: shm_example_simple_lap_d_facto1_sched0_kway_tqrcpend ....................***Timeout 207.51 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Start 395: shm_example_simple_lap_d_facto1_sched0_kway_tqrcpend 294/3626 Test #406: shm_example_simple_lap_d_facto2_sched0_not_svdbegin .....................***Timeout 207.43 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of 
GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.428752e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.314149e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.257360e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 1.389089e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.633960e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 5.160528e-01 s Time to initialize coeftab 7.305645e-01 s Time to factorize 1.106870e+00 s ( 9.02 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag 
in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 7.855490e-03 s - iteration 1 : total iteration time 0.00464 s error 1.5585e-14 Time for refinement 8.725075e-03 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.558540e-14 max(|| b_i - A x_i ||_1) 2.717472e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.414737e-02 (SUCCESS) Start 406: shm_example_simple_lap_d_facto2_sched0_not_svdbegin 294/3626 Test #410: shm_example_simple_lap_d_facto2_sched0_kwayprojections_svdbegin .........***Timeout 207.39 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Start 410: shm_example_simple_lap_d_facto2_sched0_kwayprojections_svdbegin 294/3626 Test #415: shm_example_simple_lap_d_facto2_sched0_kway_pqrcpend ....................***Timeout 207.36 sec 
ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Start 415: shm_example_simple_lap_d_facto2_sched0_kway_pqrcpend 294/3626 Test #418: shm_example_simple_lap_d_facto2_sched0_not_rqrcpbegin ...................***Timeout 207.35 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal 
height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.644210e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.728696e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.805660e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 9.463258e-02 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.655982e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.310481e-03 s Time to initialize coeftab 1.381146e+00 s Time to factorize 1.219432e+00 s ( 8.19 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 4.983097e-03 s - iteration 1 : total iteration time 0.00394 s error 5.5242e-14 Time for refinement 8.028727e-03 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 
2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.524210e-14 max(|| b_i - A x_i ||_1) 6.118383e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.688275e-02 (SUCCESS) Start 418: shm_example_simple_lap_d_facto2_sched0_not_rqrcpbegin 294/3626 Test #430: shm_example_simple_lap_d_facto2_sched0_not_rqrrtbegin ...................***Timeout 207.26 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Start 430: shm_example_simple_lap_d_facto2_sched0_not_rqrrtbegin 294/3626 Test #434: shm_example_simple_lap_d_facto2_sched0_kwayprojections_rqrrtbegin .......***Timeout 207.24 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI 
processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.435029e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.575207e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.267637e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 1.634702e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.669082e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.845262e-01 s Time to initialize coeftab 1.631337e+00 s Time to factorize 2.394248e+00 s ( 4.17 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ 
Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 5.287239e-03 s - iteration 1 : total iteration time 0.00394 s error 1.008e-12 - iteration 2 : total iteration time 0.00331 s error 4.2236e-18 Time for refinement 1.507707e-02 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.255520e-16 max(|| b_i - A x_i ||_1) 6.250679e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.854516e-04 (SUCCESS) Start 434: shm_example_simple_lap_d_facto2_sched0_kwayprojections_rqrrtbegin 294/3626 Test #438: shm_example_simple_lap_c_facto0_sched0_not_svdbegin .....................***Timeout 207.22 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 438: 
shm_example_simple_lap_c_facto0_sched0_not_svdbegin 294/3626 Test #439: shm_example_simple_lap_c_facto0_sched0_not_svdend .......................***Timeout 207.22 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 439: shm_example_simple_lap_c_facto0_sched0_not_svdend 294/3626 Test #440: shm_example_simple_lap_c_facto0_sched0_kway_svdbegin ....................***Timeout 207.22 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel 
MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 440: shm_example_simple_lap_c_facto0_sched0_kway_svdbegin 294/3626 Test #442: shm_example_simple_lap_c_facto0_sched0_kwayprojections_svdbegin .........***Timeout 207.21 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 442: shm_example_simple_lap_c_facto0_sched0_kwayprojections_svdbegin 294/3626 Test #443: shm_example_simple_lap_c_facto0_sched0_kwayprojections_svdend ...........***Timeout 207.22 sec ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 443: shm_example_simple_lap_c_facto0_sched0_kwayprojections_svdend 294/3626 Test #447: shm_example_simple_lap_c_facto0_sched0_kway_pqrcpend ....................***Timeout 207.19 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 
Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 447: shm_example_simple_lap_c_facto0_sched0_kway_pqrcpend 294/3626 Test #451: shm_example_simple_lap_c_facto0_sched0_not_rqrcpend .....................***Timeout 207.17 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 451: shm_example_simple_lap_c_facto0_sched0_not_rqrcpend 294/3626 Test #452: shm_example_simple_lap_c_facto0_sched0_kway_rqrcpbegin ..................***Timeout 207.17 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread 
dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 452: shm_example_simple_lap_c_facto0_sched0_kway_rqrcpbegin 294/3626 Test #459: shm_example_simple_lap_c_facto0_sched0_kway_tqrcpend ....................***Timeout 207.12 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 
Format: CSC N: 1000 nnz: 3700 Start 459: shm_example_simple_lap_c_facto0_sched0_kway_tqrcpend 294/3626 Test #465: shm_example_simple_lap_c_facto0_sched0_kway_rqrrtend ....................***Timeout 207.07 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 465: shm_example_simple_lap_c_facto0_sched0_kway_rqrrtend 294/3626 Test #466: shm_example_simple_lap_c_facto0_sched0_kwayprojections_rqrrtbegin .......***Timeout 207.08 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 
Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 466: shm_example_simple_lap_c_facto0_sched0_kwayprojections_rqrrtbegin 294/3626 Test #470: shm_example_simple_lap_c_facto1_sched0_not_svdbegin .....................***Timeout 207.05 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 470: shm_example_simple_lap_c_facto1_sched0_not_svdbegin 294/3626 Test #472: shm_example_simple_lap_c_facto1_sched0_kway_svdbegin ....................***Timeout 207.05 sec 
ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 472: shm_example_simple_lap_c_facto1_sched0_kway_svdbegin 294/3626 Test #482: shm_example_simple_lap_c_facto1_sched0_not_rqrcpbegin ...................***Timeout 206.97 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal 
height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 482: shm_example_simple_lap_c_facto1_sched0_not_rqrcpbegin 294/3626 Test #489: shm_example_simple_lap_c_facto1_sched0_not_tqrcpend .....................***Timeout 206.92 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 489: shm_example_simple_lap_c_facto1_sched0_not_tqrcpend 294/3626 Test #498: shm_example_simple_lap_c_facto1_sched0_kwayprojections_rqrrtbegin .......***Timeout 206.81 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: 
sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.683711e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.112723e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.075638e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.408720e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.194438e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 5.160930e-01 s Time to initialize coeftab 9.352970e-01 s Time to factorize 1.428009e+00 s (14.92 MFlop/s) 
Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 1.408241e-02 s - iteration 1 : total iteration time 0.0147 s error 5.1961e-11 Time for refinement 2.355013e-02 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822262e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.866677e-08 max(|| b_i - A x_i ||_1) 3.344820e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.439981e-01 (SUCCESS) Start 498: shm_example_simple_lap_c_facto1_sched0_kwayprojections_rqrrtbegin 294/3626 Test #500: shm_example_simple_lap_c_facto1_sched0_kway_pqrcpilu0 ...................***Timeout 206.54 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 
nnz: 3700 Start 500: shm_example_simple_lap_c_facto1_sched0_kway_pqrcpilu0 294/3626 Test #503: shm_example_simple_lap_c_facto2_sched0_not_svdend .......................***Timeout 206.51 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 503: shm_example_simple_lap_c_facto2_sched0_not_svdend 294/3626 Test #504: shm_example_simple_lap_c_facto2_sched0_kway_svdbegin ....................***Timeout 206.51 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: 
AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 504: shm_example_simple_lap_c_facto2_sched0_kway_svdbegin 294/3626 Test #505: shm_example_simple_lap_c_facto2_sched0_kway_svdend ......................***Timeout 206.50 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 505: shm_example_simple_lap_c_facto2_sched0_kway_svdend 294/3626 Test #506: shm_example_simple_lap_c_facto2_sched0_kwayprojections_svdbegin .........***Timeout 206.49 sec ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.484030e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.905522e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.806869e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 1.503016e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.642116e+01 s +-------------------------------------------------+ Factorization 
task: Factorization used: LU Time to initialize internal csc 6.194635e-03 s Time to initialize coeftab 2.304272e+00 s Time to factorize 1.067381e+01 s ( 3.74 MFlop/s) Number of operations 8.15 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 1.492664e-02 s Time for refinement 4.334401e-03 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.157403e-07 max(|| b_i - A x_i ||_1) 9.287170e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.343431e+00 (SUCCESS) Start 506: shm_example_simple_lap_c_facto2_sched0_kwayprojections_svdbegin 294/3626 Test #509: shm_example_simple_lap_c_facto2_sched0_not_pqrcpend .....................***Timeout 206.46 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 
Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.487110e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.833311e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.078422e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 2.707415e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.564593e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.262623e-02 s Time to initialize coeftab 1.373955e-01 s Time to factorize 1.779452e+00 s (22.46 MFlop/s) Number of operations 8.15 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 5.041076e-02 s Time for refinement 1.339235e-02 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.031245e-07 max(|| b_i - A x_i ||_1) 8.570109e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.162495e+00 (SUCCESS) Start 509: 
shm_example_simple_lap_c_facto2_sched0_not_pqrcpend 294/3626 Test #510: shm_example_simple_lap_c_facto2_sched0_kway_pqrcpbegin ..................***Timeout 206.45 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 510: shm_example_simple_lap_c_facto2_sched0_kway_pqrcpbegin 294/3626 Test #516: shm_example_simple_lap_c_facto2_sched0_kway_rqrcpbegin .................. Passed 41.16 sec 295/3626 Test #519: shm_example_simple_lap_c_facto2_sched0_kwayprojections_rqrcpend ......... Passed 38.58 sec Start 804: shm_example_simple_lap_s_facto1_sched1_kway_rqrcpbegin Start 805: shm_example_simple_lap_s_facto1_sched1_kway_rqrcpend Start 806: shm_example_simple_lap_s_facto1_sched1_kwayprojections_rqrcpbegin Start 807: shm_example_simple_lap_s_facto1_sched1_kwayprojections_rqrcpend 296/3626 Test #539: shm_example_simple_lap_c_facto3_sched0_kwayprojections_svdend ........... 
Passed 30.70 sec 297/3626 Test #540: shm_example_simple_lap_c_facto3_sched0_not_pqrcpbegin ................... Passed 30.57 sec 298/3626 Test #538: shm_example_simple_lap_c_facto3_sched0_kwayprojections_svdbegin ......... Passed 30.89 sec 299/3626 Test #533: shm_example_simple_lap_c_facto2_sched0_kway_pqrcpilu1 ................... Passed 33.33 sec 300/3626 Test #514: shm_example_simple_lap_c_facto2_sched0_not_rqrcpbegin ................... Passed 41.90 sec 301/3626 Test #524: shm_example_simple_lap_c_facto2_sched0_kwayprojections_tqrcpbegin ....... Passed 36.69 sec 302/3626 Test #536: shm_example_simple_lap_c_facto3_sched0_kway_svdbegin .................... Passed 31.57 sec 303/3626 Test #520: shm_example_simple_lap_c_facto2_sched0_not_tqrcpbegin ................... Passed 38.63 sec 304/3626 Test #523: shm_example_simple_lap_c_facto2_sched0_kway_tqrcpend .................... Passed 37.56 sec 305/3626 Test #515: shm_example_simple_lap_c_facto2_sched0_not_rqrcpend ..................... Passed 41.47 sec 306/3626 Test #531: shm_example_simple_lap_c_facto2_sched0_kwayprojections_rqrrtend ......... Passed 34.32 sec 307/3626 Test #529: shm_example_simple_lap_c_facto2_sched0_kway_rqrrtend .................... Passed 35.32 sec 308/3626 Test #535: shm_example_simple_lap_c_facto3_sched0_not_svdend ....................... Passed 32.81 sec 309/3626 Test #521: shm_example_simple_lap_c_facto2_sched0_not_tqrcpend ..................... Passed 38.44 sec 310/3626 Test #513: shm_example_simple_lap_c_facto2_sched0_kwayprojections_pqrcpend ......... Passed 42.70 sec 311/3626 Test #526: shm_example_simple_lap_c_facto2_sched0_not_rqrrtbegin ................... Passed 36.54 sec 312/3626 Test #532: shm_example_simple_lap_c_facto2_sched0_kway_pqrcpilu0 ................... Passed 33.78 sec 313/3626 Test #525: shm_example_simple_lap_c_facto2_sched0_kwayprojections_tqrcpend ......... Passed 36.58 sec 314/3626 Test #537: shm_example_simple_lap_c_facto3_sched0_kway_svdend ...................... 
Passed 31.34 sec 315/3626 Test #517: shm_example_simple_lap_c_facto2_sched0_kway_rqrcpend .................... Passed 40.64 sec 316/3626 Test #528: shm_example_simple_lap_c_facto2_sched0_kway_rqrrtbegin .................. Passed 35.67 sec 317/3626 Test #530: shm_example_simple_lap_c_facto2_sched0_kwayprojections_rqrrtbegin ....... Passed 34.60 sec 318/3626 Test #522: shm_example_simple_lap_c_facto2_sched0_kway_tqrcpbegin .................. Passed 38.33 sec 319/3626 Test #518: shm_example_simple_lap_c_facto2_sched0_kwayprojections_rqrcpbegin ....... Passed 40.43 sec Start 808: shm_example_simple_lap_s_facto1_sched1_not_tqrcpbegin Start 809: shm_example_simple_lap_s_facto1_sched1_not_tqrcpend Start 810: shm_example_simple_lap_s_facto1_sched1_kway_tqrcpbegin Start 811: shm_example_simple_lap_s_facto1_sched1_kway_tqrcpend Start 812: shm_example_simple_lap_s_facto1_sched1_kwayprojections_tqrcpbegin Start 813: shm_example_simple_lap_s_facto1_sched1_kwayprojections_tqrcpend Start 814: shm_example_simple_lap_s_facto1_sched1_not_rqrrtbegin Start 815: shm_example_simple_lap_s_facto1_sched1_not_rqrrtend Start 816: shm_example_simple_lap_s_facto1_sched1_kway_rqrrtbegin Start 817: shm_example_simple_lap_s_facto1_sched1_kway_rqrrtend Start 818: shm_example_simple_lap_s_facto1_sched1_kwayprojections_rqrrtbegin Start 819: shm_example_simple_lap_s_facto1_sched1_kwayprojections_rqrrtend Start 820: shm_example_simple_lap_s_facto1_sched1_kway_pqrcpilu0 Start 821: shm_example_simple_lap_s_facto1_sched1_kway_pqrcpilu1 Start 822: shm_example_simple_lap_s_facto2_sched1_not_svdbegin Start 823: shm_example_simple_lap_s_facto2_sched1_not_svdend Start 824: shm_example_simple_lap_s_facto2_sched1_kway_svdbegin Start 825: shm_example_simple_lap_s_facto2_sched1_kway_svdend Start 826: shm_example_simple_lap_s_facto2_sched1_kwayprojections_svdbegin Start 827: shm_example_simple_lap_s_facto2_sched1_kwayprojections_svdend Start 828: shm_example_simple_lap_s_facto2_sched1_not_pqrcpbegin Start 829: 
shm_example_simple_lap_s_facto2_sched1_not_pqrcpend Start 830: shm_example_simple_lap_s_facto2_sched1_kway_pqrcpbegin Start 831: shm_example_simple_lap_s_facto2_sched1_kway_pqrcpend 320/3626 Test #542: shm_example_simple_lap_c_facto3_sched0_kway_pqrcpbegin .................. Passed 58.59 sec Start 832: shm_example_simple_lap_s_facto2_sched1_kwayprojections_pqrcpbegin 321/3626 Test #541: shm_example_simple_lap_c_facto3_sched0_not_pqrcpend ..................... Passed 87.83 sec Start 833: shm_example_simple_lap_s_facto2_sched1_kwayprojections_pqrcpend 322/3626 Test #546: shm_example_simple_lap_c_facto3_sched0_not_rqrcpbegin ................... Passed 86.29 sec Start 834: shm_example_simple_lap_s_facto2_sched1_not_rqrcpbegin 323/3626 Test #543: shm_example_simple_lap_c_facto3_sched0_kway_pqrcpend .................... Passed 87.08 sec Start 835: shm_example_simple_lap_s_facto2_sched1_not_rqrcpend 324/3626 Test #548: shm_example_simple_lap_c_facto3_sched0_kway_rqrcpbegin .................. Passed 87.24 sec Start 836: shm_example_simple_lap_s_facto2_sched1_kway_rqrcpbegin 325/3626 Test #549: shm_example_simple_lap_c_facto3_sched0_kway_rqrcpend .................... Passed 87.73 sec Start 837: shm_example_simple_lap_s_facto2_sched1_kway_rqrcpend 326/3626 Test #545: shm_example_simple_lap_c_facto3_sched0_kwayprojections_pqrcpend ......... Passed 89.46 sec Start 838: shm_example_simple_lap_s_facto2_sched1_kwayprojections_rqrcpbegin 327/3626 Test #551: shm_example_simple_lap_c_facto3_sched0_kwayprojections_rqrcpend ......... Passed 88.45 sec Start 839: shm_example_simple_lap_s_facto2_sched1_kwayprojections_rqrcpend 328/3626 Test #550: shm_example_simple_lap_c_facto3_sched0_kwayprojections_rqrcpbegin ....... Passed 90.14 sec Start 840: shm_example_simple_lap_s_facto2_sched1_not_tqrcpbegin 329/3626 Test #553: shm_example_simple_lap_c_facto3_sched0_not_tqrcpend ..................... 
Passed 94.32 sec Start 841: shm_example_simple_lap_s_facto2_sched1_not_tqrcpend 330/3626 Test #544: shm_example_simple_lap_c_facto3_sched0_kwayprojections_pqrcpbegin ....... Passed 97.89 sec Start 842: shm_example_simple_lap_s_facto2_sched1_kway_tqrcpbegin 331/3626 Test #552: shm_example_simple_lap_c_facto3_sched0_not_tqrcpbegin ................... Passed 104.46 sec Start 843: shm_example_simple_lap_s_facto2_sched1_kway_tqrcpend 332/3626 Test #558: shm_example_simple_lap_c_facto3_sched0_not_rqrrtbegin ................... Passed 104.62 sec Start 844: shm_example_simple_lap_s_facto2_sched1_kwayprojections_tqrcpbegin 333/3626 Test #557: shm_example_simple_lap_c_facto3_sched0_kwayprojections_tqrcpend ......... Passed 106.62 sec Start 845: shm_example_simple_lap_s_facto2_sched1_kwayprojections_tqrcpend 334/3626 Test #559: shm_example_simple_lap_c_facto3_sched0_not_rqrrtend ..................... Passed 106.43 sec Start 846: shm_example_simple_lap_s_facto2_sched1_not_rqrrtbegin 335/3626 Test #560: shm_example_simple_lap_c_facto3_sched0_kway_rqrrtbegin .................. Passed 109.52 sec Start 847: shm_example_simple_lap_s_facto2_sched1_not_rqrrtend 336/3626 Test #563: shm_example_simple_lap_c_facto3_sched0_kwayprojections_rqrrtend ......... Passed 109.68 sec Start 848: shm_example_simple_lap_s_facto2_sched1_kway_rqrrtbegin 337/3626 Test #564: shm_example_simple_lap_c_facto3_sched0_kway_pqrcpilu0 ................... Passed 109.93 sec Start 849: shm_example_simple_lap_s_facto2_sched1_kway_rqrrtend 338/3626 Test #565: shm_example_simple_lap_c_facto3_sched0_kway_pqrcpilu1 ................... Passed 110.03 sec Start 850: shm_example_simple_lap_s_facto2_sched1_kwayprojections_rqrrtbegin 339/3626 Test #574: shm_example_simple_lap_c_facto4_sched0_kway_pqrcpbegin .................. Passed 109.94 sec Start 851: shm_example_simple_lap_s_facto2_sched1_kwayprojections_rqrrtend 340/3626 Test #567: shm_example_simple_lap_c_facto4_sched0_not_svdend ....................... 
Passed 110.99 sec Start 852: shm_example_simple_lap_s_facto2_sched1_kway_pqrcpilu0 341/3626 Test #569: shm_example_simple_lap_c_facto4_sched0_kway_svdend ...................... Passed 110.84 sec Start 853: shm_example_simple_lap_s_facto2_sched1_kway_pqrcpilu1 342/3626 Test #547: shm_example_simple_lap_c_facto3_sched0_not_rqrcpend ..................... Passed 113.86 sec Start 854: shm_example_simple_lap_d_facto0_sched1_not_svdbegin 343/3626 Test #576: shm_example_simple_lap_c_facto4_sched0_kwayprojections_pqrcpbegin ....... Passed 110.76 sec Start 855: shm_example_simple_lap_d_facto0_sched1_not_svdend 344/3626 Test #573: shm_example_simple_lap_c_facto4_sched0_not_pqrcpend ..................... Passed 110.89 sec Start 856: shm_example_simple_lap_d_facto0_sched1_kway_svdbegin 345/3626 Test #572: shm_example_simple_lap_c_facto4_sched0_not_pqrcpbegin ................... Passed 111.42 sec Start 857: shm_example_simple_lap_d_facto0_sched1_kway_svdend 346/3626 Test #554: shm_example_simple_lap_c_facto3_sched0_kway_tqrcpbegin .................. Passed 114.19 sec Start 858: shm_example_simple_lap_d_facto0_sched1_kwayprojections_svdbegin 347/3626 Test #566: shm_example_simple_lap_c_facto4_sched0_not_svdbegin ..................... Passed 113.86 sec Start 859: shm_example_simple_lap_d_facto0_sched1_kwayprojections_svdend 348/3626 Test #570: shm_example_simple_lap_c_facto4_sched0_kwayprojections_svdbegin ......... Passed 113.94 sec Start 860: shm_example_simple_lap_d_facto0_sched1_not_pqrcpbegin 349/3626 Test #582: shm_example_simple_lap_c_facto4_sched0_kwayprojections_rqrcpbegin ....... Passed 115.90 sec Start 861: shm_example_simple_lap_d_facto0_sched1_not_pqrcpend 350/3626 Test #571: shm_example_simple_lap_c_facto4_sched0_kwayprojections_svdend ........... Passed 117.32 sec Start 862: shm_example_simple_lap_d_facto0_sched1_kway_pqrcpbegin 351/3626 Test #584: shm_example_simple_lap_c_facto4_sched0_not_tqrcpbegin ................... 
Passed 117.47 sec Start 863: shm_example_simple_lap_d_facto0_sched1_kway_pqrcpend 352/3626 Test #583: shm_example_simple_lap_c_facto4_sched0_kwayprojections_rqrcpend ......... Passed 117.61 sec Start 864: shm_example_simple_lap_d_facto0_sched1_kwayprojections_pqrcpbegin 353/3626 Test #587: shm_example_simple_lap_c_facto4_sched0_kway_tqrcpend .................... Passed 118.96 sec Start 865: shm_example_simple_lap_d_facto0_sched1_kwayprojections_pqrcpend 354/3626 Test #577: shm_example_simple_lap_c_facto4_sched0_kwayprojections_pqrcpend ......... Passed 119.65 sec Start 866: shm_example_simple_lap_d_facto0_sched1_not_rqrcpbegin 355/3626 Test #581: shm_example_simple_lap_c_facto4_sched0_kway_rqrcpend .................... Passed 121.69 sec Start 867: shm_example_simple_lap_d_facto0_sched1_not_rqrcpend 356/3626 Test #562: shm_example_simple_lap_c_facto3_sched0_kwayprojections_rqrrtbegin ....... Passed 123.65 sec Start 868: shm_example_simple_lap_d_facto0_sched1_kway_rqrcpbegin 357/3626 Test #586: shm_example_simple_lap_c_facto4_sched0_kway_tqrcpbegin .................. Passed 122.37 sec Start 869: shm_example_simple_lap_d_facto0_sched1_kway_rqrcpend 358/3626 Test #575: shm_example_simple_lap_c_facto4_sched0_kway_pqrcpend .................... Passed 123.25 sec Start 870: shm_example_simple_lap_d_facto0_sched1_kwayprojections_rqrcpbegin 359/3626 Test #593: shm_example_simple_lap_c_facto4_sched0_kway_rqrrtend .................... Passed 122.22 sec Start 871: shm_example_simple_lap_d_facto0_sched1_kwayprojections_rqrcpend 360/3626 Test #596: shm_example_simple_lap_c_facto4_sched0_kway_pqrcpilu0 ................... Passed 121.98 sec Start 872: shm_example_simple_lap_d_facto0_sched1_not_tqrcpbegin 361/3626 Test #591: shm_example_simple_lap_c_facto4_sched0_not_rqrrtend ..................... Passed 123.12 sec Start 873: shm_example_simple_lap_d_facto0_sched1_not_tqrcpend 362/3626 Test #585: shm_example_simple_lap_c_facto4_sched0_not_tqrcpend ..................... 
Passed 123.64 sec Start 874: shm_example_simple_lap_d_facto0_sched1_kway_tqrcpbegin 363/3626 Test #578: shm_example_simple_lap_c_facto4_sched0_not_rqrcpbegin ................... Passed 124.60 sec Start 875: shm_example_simple_lap_d_facto0_sched1_kway_tqrcpend 364/3626 Test #594: shm_example_simple_lap_c_facto4_sched0_kwayprojections_rqrrtbegin ....... Passed 123.92 sec Start 876: shm_example_simple_lap_d_facto0_sched1_kwayprojections_tqrcpbegin 365/3626 Test #588: shm_example_simple_lap_c_facto4_sched0_kwayprojections_tqrcpbegin ....... Passed 124.31 sec Start 877: shm_example_simple_lap_d_facto0_sched1_kwayprojections_tqrcpend 366/3626 Test #603: shm_example_simple_lap_z_facto0_sched0_kwayprojections_svdend ........... Passed 124.12 sec Start 878: shm_example_simple_lap_d_facto0_sched1_not_rqrrtbegin 367/3626 Test #597: shm_example_simple_lap_c_facto4_sched0_kway_pqrcpilu1 ................... Passed 124.63 sec Start 879: shm_example_simple_lap_d_facto0_sched1_not_rqrrtend 368/3626 Test #590: shm_example_simple_lap_c_facto4_sched0_not_rqrrtbegin ................... Passed 125.32 sec Start 880: shm_example_simple_lap_d_facto0_sched1_kway_rqrrtbegin 369/3626 Test #602: shm_example_simple_lap_z_facto0_sched0_kwayprojections_svdbegin ......... Passed 125.34 sec Start 881: shm_example_simple_lap_d_facto0_sched1_kway_rqrrtend 370/3626 Test #598: shm_example_simple_lap_z_facto0_sched0_not_svdbegin ..................... Passed 125.84 sec Start 882: shm_example_simple_lap_d_facto0_sched1_kwayprojections_rqrrtbegin 371/3626 Test #608: shm_example_simple_lap_z_facto0_sched0_kwayprojections_pqrcpbegin ....... Passed 125.14 sec Start 883: shm_example_simple_lap_d_facto0_sched1_kwayprojections_rqrrtend 372/3626 Test #609: shm_example_simple_lap_z_facto0_sched0_kwayprojections_pqrcpend ......... Passed 126.11 sec Start 884: shm_example_simple_lap_d_facto0_sched1_kway_pqrcpilu0 373/3626 Test #611: shm_example_simple_lap_z_facto0_sched0_not_rqrcpend ..................... 
Passed 126.66 sec Start 885: shm_example_simple_lap_d_facto0_sched1_kway_pqrcpilu1 374/3626 Test #623: shm_example_simple_lap_z_facto0_sched0_not_rqrrtend ..................... Passed 126.29 sec Start 886: shm_example_simple_lap_d_facto1_sched1_not_svdbegin 375/3626 Test #619: shm_example_simple_lap_z_facto0_sched0_kway_tqrcpend .................... Passed 126.66 sec Start 887: shm_example_simple_lap_d_facto1_sched1_not_svdend 376/3626 Test #601: shm_example_simple_lap_z_facto0_sched0_kway_svdend ...................... Passed 127.93 sec Start 888: shm_example_simple_lap_d_facto1_sched1_kway_svdbegin 377/3626 Test #561: shm_example_simple_lap_c_facto3_sched0_kway_rqrrtend .................... Passed 130.63 sec Start 889: shm_example_simple_lap_d_facto1_sched1_kway_svdend 378/3626 Test #607: shm_example_simple_lap_z_facto0_sched0_kway_pqrcpend .................... Passed 127.44 sec Start 890: shm_example_simple_lap_d_facto1_sched1_kwayprojections_svdbegin 379/3626 Test #604: shm_example_simple_lap_z_facto0_sched0_not_pqrcpbegin ................... Passed 127.86 sec Start 891: shm_example_simple_lap_d_facto1_sched1_kwayprojections_svdend 380/3626 Test #610: shm_example_simple_lap_z_facto0_sched0_not_rqrcpbegin ................... Passed 127.56 sec Start 892: shm_example_simple_lap_d_facto1_sched1_not_pqrcpbegin 381/3626 Test #625: shm_example_simple_lap_z_facto0_sched0_kway_rqrrtend .................... Passed 126.77 sec Start 893: shm_example_simple_lap_d_facto1_sched1_not_pqrcpend 382/3626 Test #617: shm_example_simple_lap_z_facto0_sched0_not_tqrcpend ..................... Passed 127.26 sec Start 894: shm_example_simple_lap_d_facto1_sched1_kway_pqrcpbegin 383/3626 Test #599: shm_example_simple_lap_z_facto0_sched0_not_svdend ....................... Passed 128.46 sec Start 895: shm_example_simple_lap_d_facto1_sched1_kway_pqrcpend 384/3626 Test #600: shm_example_simple_lap_z_facto0_sched0_kway_svdbegin .................... 
Passed 128.46 sec Start 896: shm_example_simple_lap_d_facto1_sched1_kwayprojections_pqrcpbegin 385/3626 Test #605: shm_example_simple_lap_z_facto0_sched0_not_pqrcpend ..................... Passed 128.36 sec Start 897: shm_example_simple_lap_d_facto1_sched1_kwayprojections_pqrcpend 386/3626 Test #628: shm_example_simple_lap_z_facto0_sched0_kway_pqrcpilu0 ................... Passed 126.94 sec Start 898: shm_example_simple_lap_d_facto1_sched1_not_rqrcpbegin 387/3626 Test #614: shm_example_simple_lap_z_facto0_sched0_kwayprojections_rqrcpbegin ....... Passed 128.01 sec Start 899: shm_example_simple_lap_d_facto1_sched1_not_rqrcpend 388/3626 Test #579: shm_example_simple_lap_c_facto4_sched0_not_rqrcpend ..................... Passed 130.40 sec Start 900: shm_example_simple_lap_d_facto1_sched1_kway_rqrcpbegin 389/3626 Test #613: shm_example_simple_lap_z_facto0_sched0_kway_rqrcpend .................... Passed 128.50 sec Start 901: shm_example_simple_lap_d_facto1_sched1_kway_rqrcpend 390/3626 Test #620: shm_example_simple_lap_z_facto0_sched0_kwayprojections_tqrcpbegin ....... Passed 128.10 sec Start 902: shm_example_simple_lap_d_facto1_sched1_kwayprojections_rqrcpbegin 391/3626 Test #624: shm_example_simple_lap_z_facto0_sched0_kway_rqrrtbegin .................. Passed 128.11 sec Start 903: shm_example_simple_lap_d_facto1_sched1_kwayprojections_rqrcpend 392/3626 Test #580: shm_example_simple_lap_c_facto4_sched0_kway_rqrcpbegin .................. Passed 132.28 sec Start 904: shm_example_simple_lap_d_facto1_sched1_not_tqrcpbegin 393/3626 Test #556: shm_example_simple_lap_c_facto3_sched0_kwayprojections_tqrcpbegin ....... Passed 134.35 sec Start 905: shm_example_simple_lap_d_facto1_sched1_not_tqrcpend 394/3626 Test #568: shm_example_simple_lap_c_facto4_sched0_kway_svdbegin .................... Passed 133.02 sec Start 906: shm_example_simple_lap_d_facto1_sched1_kway_tqrcpbegin 395/3626 Test #630: shm_example_simple_lap_z_facto1_sched0_not_svdbegin ..................... 
Passed 129.18 sec Start 907: shm_example_simple_lap_d_facto1_sched1_kway_tqrcpend 396/3626 Test #618: shm_example_simple_lap_z_facto0_sched0_kway_tqrcpbegin .................. Passed 130.05 sec Start 908: shm_example_simple_lap_d_facto1_sched1_kwayprojections_tqrcpbegin 397/3626 Test #622: shm_example_simple_lap_z_facto0_sched0_not_rqrrtbegin ................... Passed 131.63 sec Start 909: shm_example_simple_lap_d_facto1_sched1_kwayprojections_tqrcpend 398/3626 Test #629: shm_example_simple_lap_z_facto0_sched0_kway_pqrcpilu1 ................... Passed 131.61 sec Start 910: shm_example_simple_lap_d_facto1_sched1_not_rqrrtbegin 399/3626 Test #631: shm_example_simple_lap_z_facto1_sched0_not_svdend ....................... Passed 131.73 sec Start 911: shm_example_simple_lap_d_facto1_sched1_not_rqrrtend 400/3626 Test #595: shm_example_simple_lap_c_facto4_sched0_kwayprojections_rqrrtend ......... Passed 134.42 sec Start 912: shm_example_simple_lap_d_facto1_sched1_kway_rqrrtbegin 401/3626 Test #589: shm_example_simple_lap_c_facto4_sched0_kwayprojections_tqrcpend ......... Passed 134.96 sec Start 913: shm_example_simple_lap_d_facto1_sched1_kway_rqrrtend 402/3626 Test #616: shm_example_simple_lap_z_facto0_sched0_not_tqrcpbegin ................... Passed 133.86 sec Start 914: shm_example_simple_lap_d_facto1_sched1_kwayprojections_rqrrtbegin 403/3626 Test #632: shm_example_simple_lap_z_facto1_sched0_kway_svdbegin .................... Passed 134.94 sec Start 915: shm_example_simple_lap_d_facto1_sched1_kwayprojections_rqrrtend 404/3626 Test #615: shm_example_simple_lap_z_facto0_sched0_kwayprojections_rqrcpend ......... Passed 136.84 sec Start 916: shm_example_simple_lap_d_facto1_sched1_kway_pqrcpilu0 405/3626 Test #640: shm_example_simple_lap_z_facto1_sched0_kwayprojections_pqrcpbegin ....... Passed 135.79 sec Start 917: shm_example_simple_lap_d_facto1_sched1_kway_pqrcpilu1 406/3626 Test #645: shm_example_simple_lap_z_facto1_sched0_kway_rqrcpend .................... 
Passed 136.20 sec Start 918: shm_example_simple_lap_d_facto2_sched1_not_svdbegin 407/3626 Test #644: shm_example_simple_lap_z_facto1_sched0_kway_rqrcpbegin .................. Passed 136.25 sec Start 919: shm_example_simple_lap_d_facto2_sched1_not_svdend 408/3626 Test #641: shm_example_simple_lap_z_facto1_sched0_kwayprojections_pqrcpend ......... Passed 136.44 sec Start 920: shm_example_simple_lap_d_facto2_sched1_kway_svdbegin 409/3626 Test #636: shm_example_simple_lap_z_facto1_sched0_not_pqrcpbegin ................... Passed 136.57 sec Start 921: shm_example_simple_lap_d_facto2_sched1_kway_svdend 410/3626 Test #635: shm_example_simple_lap_z_facto1_sched0_kwayprojections_svdend ........... Passed 136.66 sec Start 922: shm_example_simple_lap_d_facto2_sched1_kwayprojections_svdbegin 411/3626 Test #642: shm_example_simple_lap_z_facto1_sched0_not_rqrcpbegin ................... Passed 136.78 sec Start 923: shm_example_simple_lap_d_facto2_sched1_kwayprojections_svdend 412/3626 Test #612: shm_example_simple_lap_z_facto0_sched0_kway_rqrcpbegin .................. Passed 139.58 sec Start 924: shm_example_simple_lap_d_facto2_sched1_not_pqrcpbegin 413/3626 Test #643: shm_example_simple_lap_z_facto1_sched0_not_rqrcpend ..................... Passed 137.10 sec Start 925: shm_example_simple_lap_d_facto2_sched1_not_pqrcpend 414/3626 Test #657: shm_example_simple_lap_z_facto1_sched0_kway_rqrrtend .................... Passed 136.66 sec Start 926: shm_example_simple_lap_d_facto2_sched1_kway_pqrcpbegin 415/3626 Test #647: shm_example_simple_lap_z_facto1_sched0_kwayprojections_rqrcpend ......... Passed 138.14 sec Start 927: shm_example_simple_lap_d_facto2_sched1_kway_pqrcpend 416/3626 Test #654: shm_example_simple_lap_z_facto1_sched0_not_rqrrtbegin ................... Passed 138.32 sec Start 928: shm_example_simple_lap_d_facto2_sched1_kwayprojections_pqrcpbegin 417/3626 Test #555: shm_example_simple_lap_c_facto3_sched0_kway_tqrcpend .................... 
Passed 145.72 sec Start 929: shm_example_simple_lap_d_facto2_sched1_kwayprojections_pqrcpend 418/3626 Test #650: shm_example_simple_lap_z_facto1_sched0_kway_tqrcpbegin .................. Passed 139.35 sec Start 930: shm_example_simple_lap_d_facto2_sched1_not_rqrcpbegin 419/3626 Test #626: shm_example_simple_lap_z_facto0_sched0_kwayprojections_rqrrtbegin ....... Passed 142.29 sec Start 931: shm_example_simple_lap_d_facto2_sched1_not_rqrcpend 420/3626 Test #633: shm_example_simple_lap_z_facto1_sched0_kway_svdend ...................... Passed 145.82 sec Start 932: shm_example_simple_lap_d_facto2_sched1_kway_rqrcpbegin 421/3626 Test #659: shm_example_simple_lap_z_facto1_sched0_kwayprojections_rqrrtend ......... Passed 143.86 sec Start 933: shm_example_simple_lap_d_facto2_sched1_kway_rqrcpend 422/3626 Test #649: shm_example_simple_lap_z_facto1_sched0_not_tqrcpend ..................... Passed 145.31 sec Start 934: shm_example_simple_lap_d_facto2_sched1_kwayprojections_rqrcpbegin 423/3626 Test #648: shm_example_simple_lap_z_facto1_sched0_not_tqrcpbegin ................... Passed 147.53 sec Start 935: shm_example_simple_lap_d_facto2_sched1_kwayprojections_rqrcpend 424/3626 Test #621: shm_example_simple_lap_z_facto0_sched0_kwayprojections_tqrcpend ......... Passed 150.25 sec Start 936: shm_example_simple_lap_d_facto2_sched1_not_tqrcpend 425/3626 Test #653: shm_example_simple_lap_z_facto1_sched0_kwayprojections_tqrcpend ......... Passed 147.80 sec Start 937: shm_example_simple_lap_d_facto2_sched1_kway_tqrcpbegin 426/3626 Test #646: shm_example_simple_lap_z_facto1_sched0_kwayprojections_rqrcpbegin ....... Passed 150.08 sec Start 938: shm_example_simple_lap_d_facto2_sched1_kway_tqrcpend 427/3626 Test #651: shm_example_simple_lap_z_facto1_sched0_kway_tqrcpend .................... Passed 150.64 sec Start 939: shm_example_simple_lap_d_facto2_sched1_kwayprojections_tqrcpbegin 428/3626 Test #592: shm_example_simple_lap_c_facto4_sched0_kway_rqrrtbegin .................. 
Passed 156.16 sec Start 940: shm_example_simple_lap_d_facto2_sched1_kwayprojections_tqrcpend 429/3626 Test #661: shm_example_simple_lap_z_facto1_sched0_kway_pqrcpilu1 ................... Passed 152.67 sec Start 941: shm_example_simple_lap_d_facto2_sched1_not_rqrrtbegin 430/3626 Test #673: shm_example_simple_lap_z_facto2_sched0_kwayprojections_pqrcpend ......... Passed 153.74 sec Start 942: shm_example_simple_lap_d_facto2_sched1_not_rqrrtend 431/3626 Test #665: shm_example_simple_lap_z_facto2_sched0_kway_svdend ...................... Passed 157.05 sec Start 943: shm_example_simple_lap_d_facto2_sched1_kway_rqrrtbegin 432/3626 Test #683: shm_example_simple_lap_z_facto2_sched0_kway_tqrcpend .................... Passed 154.85 sec Start 944: shm_example_simple_lap_d_facto2_sched1_kway_rqrrtend 433/3626 Test #670: shm_example_simple_lap_z_facto2_sched0_kway_pqrcpbegin .................. Passed 156.66 sec Start 945: shm_example_simple_lap_d_facto2_sched1_kwayprojections_rqrrtbegin 434/3626 Test #675: shm_example_simple_lap_z_facto2_sched0_not_rqrcpend ..................... Passed 158.29 sec Start 946: shm_example_simple_lap_d_facto2_sched1_kwayprojections_rqrrtend 435/3626 Test #666: shm_example_simple_lap_z_facto2_sched0_kwayprojections_svdbegin ......... Passed 159.75 sec Start 947: shm_example_simple_lap_d_facto2_sched1_kway_pqrcpilu0 436/3626 Test #627: shm_example_simple_lap_z_facto0_sched0_kwayprojections_rqrrtend ......... Passed 165.32 sec Start 948: shm_example_simple_lap_d_facto2_sched1_kway_pqrcpilu1 437/3626 Test #688: shm_example_simple_lap_z_facto2_sched0_kway_rqrrtbegin .................. Passed 160.16 sec Start 949: shm_example_simple_lap_c_facto0_sched1_not_svdbegin 438/3626 Test #694: shm_example_simple_lap_z_facto3_sched0_not_svdbegin ..................... Passed 161.77 sec Start 950: shm_example_simple_lap_c_facto0_sched1_not_svdend 439/3626 Test #681: shm_example_simple_lap_z_facto2_sched0_not_tqrcpend ..................... 
Passed 163.40 sec Start 951: shm_example_simple_lap_c_facto0_sched1_kway_svdbegin 440/3626 Test #652: shm_example_simple_lap_z_facto1_sched0_kwayprojections_tqrcpbegin ....... Passed 167.92 sec Start 952: shm_example_simple_lap_c_facto0_sched1_kway_svdend 441/3626 Test #639: shm_example_simple_lap_z_facto1_sched0_kway_pqrcpend .................... Passed 171.81 sec Start 953: shm_example_simple_lap_c_facto0_sched1_kwayprojections_svdbegin 442/3626 Test #689: shm_example_simple_lap_z_facto2_sched0_kway_rqrrtend .................... Passed 166.26 sec Start 954: shm_example_simple_lap_c_facto0_sched1_kwayprojections_svdend 443/3626 Test #660: shm_example_simple_lap_z_facto1_sched0_kway_pqrcpilu0 ................... Passed 170.12 sec Start 955: shm_example_simple_lap_c_facto0_sched1_not_pqrcpbegin 444/3626 Test #697: shm_example_simple_lap_z_facto3_sched0_kway_svdend ...................... Passed 166.35 sec Start 956: shm_example_simple_lap_c_facto0_sched1_not_pqrcpend 445/3626 Test #606: shm_example_simple_lap_z_facto0_sched0_kway_pqrcpbegin .................. Passed 176.73 sec Start 957: shm_example_simple_lap_c_facto0_sched1_kway_pqrcpbegin 446/3626 Test #655: shm_example_simple_lap_z_facto1_sched0_not_rqrrtend ..................... Passed 173.18 sec Start 958: shm_example_simple_lap_c_facto0_sched1_kway_pqrcpend 447/3626 Test #664: shm_example_simple_lap_z_facto2_sched0_kway_svdbegin .................... Passed 171.63 sec Start 959: shm_example_simple_lap_c_facto0_sched1_kwayprojections_pqrcpbegin 448/3626 Test #667: shm_example_simple_lap_z_facto2_sched0_kwayprojections_svdend ........... Passed 172.52 sec Start 960: shm_example_simple_lap_c_facto0_sched1_kwayprojections_pqrcpend 449/3626 Test #706: shm_example_simple_lap_z_facto3_sched0_not_rqrcpbegin ................... Passed 168.45 sec Start 961: shm_example_simple_lap_c_facto0_sched1_not_rqrcpbegin 450/3626 Test #708: shm_example_simple_lap_z_facto3_sched0_kway_rqrcpbegin .................. 
Passed 168.26 sec Start 962: shm_example_simple_lap_c_facto0_sched1_not_rqrcpend 451/3626 Test #705: shm_example_simple_lap_z_facto3_sched0_kwayprojections_pqrcpend ......... Passed 168.82 sec Start 963: shm_example_simple_lap_c_facto0_sched1_kway_rqrcpbegin 452/3626 Test #686: shm_example_simple_lap_z_facto2_sched0_not_rqrrtbegin ................... Passed 170.48 sec Start 964: shm_example_simple_lap_c_facto0_sched1_kway_rqrcpend 453/3626 Test #658: shm_example_simple_lap_z_facto1_sched0_kwayprojections_rqrrtbegin ....... Passed 174.96 sec Start 965: shm_example_simple_lap_c_facto0_sched1_kwayprojections_rqrcpbegin 454/3626 Test #634: shm_example_simple_lap_z_facto1_sched0_kwayprojections_svdbegin ......... Passed 176.38 sec Start 966: shm_example_simple_lap_c_facto0_sched1_kwayprojections_rqrcpend 455/3626 Test #637: shm_example_simple_lap_z_facto1_sched0_not_pqrcpend ..................... Passed 178.63 sec Start 967: shm_example_simple_lap_c_facto0_sched1_not_tqrcpbegin 456/3626 Test #671: shm_example_simple_lap_z_facto2_sched0_kway_pqrcpend .................... Passed 175.62 sec Start 968: shm_example_simple_lap_c_facto0_sched1_not_tqrcpend 457/3626 Test #663: shm_example_simple_lap_z_facto2_sched0_not_svdend ....................... Passed 177.77 sec Start 969: shm_example_simple_lap_c_facto0_sched1_kway_tqrcpbegin 458/3626 Test #692: shm_example_simple_lap_z_facto2_sched0_kway_pqrcpilu0 ................... Passed 175.09 sec Start 970: shm_example_simple_lap_c_facto0_sched1_kway_tqrcpend 459/3626 Test #725: shm_example_simple_lap_z_facto3_sched0_kway_pqrcpilu1 ................... Passed 171.13 sec Start 971: shm_example_simple_lap_c_facto0_sched1_kwayprojections_tqrcpbegin 460/3626 Test #668: shm_example_simple_lap_z_facto2_sched0_not_pqrcpbegin ................... Passed 178.26 sec Start 972: shm_example_simple_lap_c_facto0_sched1_kwayprojections_tqrcpend 461/3626 Test #718: shm_example_simple_lap_z_facto3_sched0_not_rqrrtbegin ................... 
Passed 172.48 sec Start 973: shm_example_simple_lap_c_facto0_sched1_not_rqrrtbegin 462/3626 Test #700: shm_example_simple_lap_z_facto3_sched0_not_pqrcpbegin ................... Passed 175.40 sec Start 974: shm_example_simple_lap_c_facto0_sched1_not_rqrrtend 463/3626 Test #685: shm_example_simple_lap_z_facto2_sched0_kwayprojections_tqrcpend ......... Passed 177.58 sec Start 975: shm_example_simple_lap_c_facto0_sched1_kway_rqrrtbegin 464/3626 Test #704: shm_example_simple_lap_z_facto3_sched0_kwayprojections_pqrcpbegin ....... Passed 177.23 sec Start 976: shm_example_simple_lap_c_facto0_sched1_kway_rqrrtend 465/3626 Test #690: shm_example_simple_lap_z_facto2_sched0_kwayprojections_rqrrtbegin ....... Passed 178.94 sec Start 977: shm_example_simple_lap_c_facto0_sched1_kwayprojections_rqrrtbegin 466/3626 Test #695: shm_example_simple_lap_z_facto3_sched0_not_svdend ....................... Passed 180.68 sec Start 978: shm_example_simple_lap_c_facto0_sched1_kwayprojections_rqrrtend 467/3626 Test #674: shm_example_simple_lap_z_facto2_sched0_not_rqrcpbegin ................... Passed 183.79 sec Start 979: shm_example_simple_lap_c_facto0_sched1_kway_pqrcpilu0 468/3626 Test #662: shm_example_simple_lap_z_facto2_sched0_not_svdbegin ..................... Passed 186.57 sec Start 980: shm_example_simple_lap_c_facto0_sched1_kway_pqrcpilu1 469/3626 Test #691: shm_example_simple_lap_z_facto2_sched0_kwayprojections_rqrrtend ......... Passed 185.58 sec Start 981: shm_example_simple_lap_c_facto1_sched1_not_svdbegin 470/3626 Test #703: shm_example_simple_lap_z_facto3_sched0_kway_pqrcpend .................... Passed 186.02 sec Start 982: shm_example_simple_lap_c_facto1_sched1_not_svdend 471/3626 Test #726: shm_example_simple_lap_z_facto4_sched0_not_svdbegin ..................... Passed 182.70 sec Start 983: shm_example_simple_lap_c_facto1_sched1_kway_svdbegin 472/3626 Test #669: shm_example_simple_lap_z_facto2_sched0_not_pqrcpend ..................... 
Passed 189.49 sec Start 984: shm_example_simple_lap_c_facto1_sched1_kway_svdend 473/3626 Test #699: shm_example_simple_lap_z_facto3_sched0_kwayprojections_svdend ........... Passed 186.42 sec Start 985: shm_example_simple_lap_c_facto1_sched1_kwayprojections_svdbegin 474/3626 Test #713: shm_example_simple_lap_z_facto3_sched0_not_tqrcpend ..................... Passed 185.02 sec Start 986: shm_example_simple_lap_c_facto1_sched1_kwayprojections_svdend 475/3626 Test #728: shm_example_simple_lap_z_facto4_sched0_kway_svdbegin .................... Passed 183.61 sec Start 987: shm_example_simple_lap_c_facto1_sched1_not_pqrcpbegin 476/3626 Test #709: shm_example_simple_lap_z_facto3_sched0_kway_rqrcpend .................... Passed 186.81 sec Start 988: shm_example_simple_lap_c_facto1_sched1_not_pqrcpend 477/3626 Test #720: shm_example_simple_lap_z_facto3_sched0_kway_rqrrtbegin .................. Passed 185.36 sec Start 989: shm_example_simple_lap_c_facto1_sched1_kway_pqrcpbegin 478/3626 Test #729: shm_example_simple_lap_z_facto4_sched0_kway_svdend ...................... Passed 184.18 sec Start 990: shm_example_simple_lap_c_facto1_sched1_kway_pqrcpend 479/3626 Test #680: shm_example_simple_lap_z_facto2_sched0_not_tqrcpbegin ................... Passed 191.02 sec Start 991: shm_example_simple_lap_c_facto1_sched1_kwayprojections_pqrcpbegin 480/3626 Test #710: shm_example_simple_lap_z_facto3_sched0_kwayprojections_rqrcpbegin ....... Passed 188.44 sec Start 992: shm_example_simple_lap_c_facto1_sched1_kwayprojections_pqrcpend Test #13: c_shm_example_analyze_lap_z_facto1 ...................................... Passed 188.04 sec Start 993: shm_example_simple_lap_c_facto1_sched1_not_rqrcpbegin 482/3626 Test #719: shm_example_simple_lap_z_facto3_sched0_not_rqrrtend ..................... Passed 190.14 sec Start 994: shm_example_simple_lap_c_facto1_sched1_not_rqrcpend Test #87: c_shm_example_schur_lap_z_facto0 ........................................ 
Passed 186.29 sec Start 995: shm_example_simple_lap_c_facto1_sched1_kway_rqrcpbegin 484/3626 Test #707: shm_example_simple_lap_z_facto3_sched0_not_rqrcpend ..................... Passed 192.55 sec Start 996: shm_example_simple_lap_c_facto1_sched1_kway_rqrcpend 485/3626 Test #701: shm_example_simple_lap_z_facto3_sched0_not_pqrcpend ..................... Passed 193.86 sec Start 997: shm_example_simple_lap_c_facto1_sched1_kwayprojections_rqrcpbegin 486/3626 Test #638: shm_example_simple_lap_z_facto1_sched0_kway_pqrcpbegin ..................***Timeout 200.38 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 638: shm_example_simple_lap_z_facto1_sched0_kway_pqrcpbegin 486/3626 Test #715: shm_example_simple_lap_z_facto3_sched0_kway_tqrcpend .................... 
Passed 192.31 sec Start 998: shm_example_simple_lap_c_facto1_sched1_kwayprojections_rqrcpend 487/3626 Test #656: shm_example_simple_lap_z_facto1_sched0_kway_rqrrtbegin ..................***Timeout 200.82 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 656: shm_example_simple_lap_z_facto1_sched0_kway_rqrrtbegin Test #51: c_shm_example_simple_trans_lap_s_facto2 ................................. Passed 189.68 sec Start 999: shm_example_simple_lap_c_facto1_sched1_not_tqrcpbegin Test #25: c_shm_example_simple_lap_c_facto2 ....................................... Passed 190.68 sec Start 1000: shm_example_simple_lap_c_facto1_sched1_not_tqrcpend 489/3626 Test #693: shm_example_simple_lap_z_facto2_sched0_kway_pqrcpilu1 ................... Passed 196.31 sec Test #50: c_shm_example_simple_trans_lap_s_facto1 ................................. 
Passed 189.95 sec Start 1001: shm_example_simple_lap_c_facto1_sched1_kway_tqrcpbegin Start 1002: shm_example_simple_lap_c_facto1_sched1_kway_tqrcpend Test #26: c_shm_example_simple_lap_c_facto3 ....................................... Passed 190.66 sec Start 1003: shm_example_simple_lap_c_facto1_sched1_kwayprojections_tqrcpbegin 492/3626 Test #717: shm_example_simple_lap_z_facto3_sched0_kwayprojections_tqrcpend ......... Passed 193.72 sec Start 1004: shm_example_simple_lap_c_facto1_sched1_kwayprojections_tqrcpend 493/3626 Test #672: shm_example_simple_lap_z_facto2_sched0_kwayprojections_pqrcpbegin .......***Timeout 199.51 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 672: shm_example_simple_lap_z_facto2_sched0_kwayprojections_pqrcpbegin 493/3626 Test #676: shm_example_simple_lap_z_facto2_sched0_kway_rqrcpbegin ..................***Timeout 199.27 sec ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 676: shm_example_simple_lap_z_facto2_sched0_kway_rqrcpbegin 493/3626 Test #684: shm_example_simple_lap_z_facto2_sched0_kwayprojections_tqrcpbegin ....... Passed 198.25 sec Start 1005: shm_example_simple_lap_c_facto1_sched1_not_rqrrtbegin Test #479: shm_example_simple_lap_c_facto1_sched0_kway_pqrcpend .................... Passed 187.81 sec Start 1006: shm_example_simple_lap_c_facto1_sched1_not_rqrrtend Test #428: shm_example_simple_lap_d_facto2_sched0_kwayprojections_tqrcpbegin ....... Passed 188.20 sec Start 1007: shm_example_simple_lap_c_facto1_sched1_kway_rqrrtbegin Test #359: shm_example_simple_lap_d_facto0_sched0_kwayprojections_rqrcpend ......... Passed 188.57 sec Start 1008: shm_example_simple_lap_c_facto1_sched1_kway_rqrrtend Test #422: shm_example_simple_lap_d_facto2_sched0_kwayprojections_rqrcpbegin ....... 
Passed 188.37 sec Start 1009: shm_example_simple_lap_c_facto1_sched1_kwayprojections_rqrrtbegin 498/3626 Test #698: shm_example_simple_lap_z_facto3_sched0_kwayprojections_svdbegin ......... Passed 197.67 sec Start 1010: shm_example_simple_lap_c_facto1_sched1_kwayprojections_rqrrtend 499/3626 Test #723: shm_example_simple_lap_z_facto3_sched0_kwayprojections_rqrrtend ......... Passed 194.47 sec Start 1011: shm_example_simple_lap_c_facto1_sched1_kway_pqrcpilu0 Test #288: shm_example_simple_lap_s_facto1_sched0_kwayprojections_pqrcpbegin ....... Passed 189.16 sec Start 1012: shm_example_simple_lap_c_facto1_sched1_kway_pqrcpilu1 501/3626 Test #696: shm_example_simple_lap_z_facto3_sched0_kway_svdbegin .................... Passed 198.13 sec Start 1013: shm_example_simple_lap_c_facto2_sched1_not_svdbegin Test #309: shm_example_simple_lap_s_facto1_sched0_kway_pqrcpilu1 ................... Passed 189.25 sec Start 1014: shm_example_simple_lap_c_facto2_sched1_not_svdend 503/3626 Test #714: shm_example_simple_lap_z_facto3_sched0_kway_tqrcpbegin .................. Passed 196.15 sec Start 1015: shm_example_simple_lap_c_facto2_sched1_kway_svdbegin Test #24: c_shm_example_simple_lap_c_facto1 ....................................... Passed 193.49 sec Start 1016: shm_example_simple_lap_c_facto2_sched1_kway_svdend 505/3626 Test #731: shm_example_simple_lap_z_facto4_sched0_kwayprojections_svdend ........... 
Passed 194.16 sec Start 1017: shm_example_simple_lap_c_facto2_sched1_kwayprojections_svdbegin 506/3626 Test #682: shm_example_simple_lap_z_facto2_sched0_kway_tqrcpbegin ..................***Timeout 200.48 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 682: shm_example_simple_lap_z_facto2_sched0_kway_tqrcpbegin 506/3626 Test #687: shm_example_simple_lap_z_facto2_sched0_not_rqrrtend .....................***Timeout 200.09 sec ischedInit: The thread number has been automatically set to 256 Start 687: shm_example_simple_lap_z_facto2_sched0_not_rqrrtend 506/3626 Test #702: shm_example_simple_lap_z_facto3_sched0_kway_pqrcpbegin ..................***Timeout 198.93 sec Start 702: shm_example_simple_lap_z_facto3_sched0_kway_pqrcpbegin 506/3626 Test #711: shm_example_simple_lap_z_facto3_sched0_kwayprojections_rqrcpend .........***Timeout 197.95 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ 
+ PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 711: shm_example_simple_lap_z_facto3_sched0_kwayprojections_rqrcpend 506/3626 Test #716: shm_example_simple_lap_z_facto3_sched0_kwayprojections_tqrcpbegin .......***Timeout 197.22 sec Start 716: shm_example_simple_lap_z_facto3_sched0_kwayprojections_tqrcpbegin Test #378: shm_example_simple_lap_d_facto1_sched0_kwayprojections_svdbegin ......... 
Passed 190.45 sec 507/3626 Test #727: shm_example_simple_lap_z_facto4_sched0_not_svdend .......................***Timeout 195.60 sec Start 727: shm_example_simple_lap_z_facto4_sched0_not_svdend 507/3626 Test #722: shm_example_simple_lap_z_facto3_sched0_kwayprojections_rqrrtbegin .......***Timeout 196.33 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 722: shm_example_simple_lap_z_facto3_sched0_kwayprojections_rqrrtbegin Start 1018: shm_example_simple_lap_c_facto2_sched1_kwayprojections_svdend Test #508: shm_example_simple_lap_c_facto2_sched0_not_pqrcpbegin ................... Passed 189.94 sec Test #413: shm_example_simple_lap_d_facto2_sched0_not_pqrcpend ..................... 
Passed 190.58 sec 509/3626 Test #724: shm_example_simple_lap_z_facto3_sched0_kway_pqrcpilu0 ...................***Timeout 196.20 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 724: shm_example_simple_lap_z_facto3_sched0_kway_pqrcpilu0 509/3626 Test #733: shm_example_simple_lap_z_facto4_sched0_not_pqrcpend ..................... 
Passed 189.94 sec Start 1019: shm_example_simple_lap_c_facto2_sched1_not_pqrcpbegin Start 1020: shm_example_simple_lap_c_facto2_sched1_not_pqrcpend Start 1021: shm_example_simple_lap_c_facto2_sched1_kway_pqrcpbegin 510/3626 Test #730: shm_example_simple_lap_z_facto4_sched0_kwayprojections_svdbegin .........***Timeout 196.13 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 730: shm_example_simple_lap_z_facto4_sched0_kwayprojections_svdbegin 510/3626 Test #732: shm_example_simple_lap_z_facto4_sched0_not_pqrcpbegin ...................***Timeout 196.06 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per 
process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 732: shm_example_simple_lap_z_facto4_sched0_not_pqrcpbegin Test #171: c_shm_example_simple_mixed_lap_z_refine_cg_sym .......................... Passed 192.26 sec Test #178: shm_example_simple_lap_d_facto1_sched0_1d ............................... Passed 192.22 sec Test #300: shm_example_simple_lap_s_facto1_sched0_kwayprojections_tqrcpbegin ....... Passed 191.74 sec Start 1022: shm_example_simple_lap_c_facto2_sched1_kway_pqrcpend Start 1023: shm_example_simple_lap_c_facto2_sched1_kwayprojections_pqrcpbegin Start 1024: shm_example_simple_lap_c_facto2_sched1_kwayprojections_pqrcpend Test #135: c_shm_example_simple_refine_bicgstab .................................... Passed 192.92 sec Start 1025: shm_example_simple_lap_c_facto2_sched1_not_rqrcpbegin Test #198: shm_example_simple_lap_c_facto2_sched1_1d ............................... Passed 192.39 sec Start 1026: shm_example_simple_lap_c_facto2_sched1_not_rqrcpend 515/3626 Test #736: shm_example_simple_lap_z_facto4_sched0_kwayprojections_pqrcpbegin ....... 
Passed 191.11 sec Start 1027: shm_example_simple_lap_c_facto2_sched1_kway_rqrcpbegin 516/3626 Test #678: shm_example_simple_lap_z_facto2_sched0_kwayprojections_rqrcpbegin .......***Timeout 202.73 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 678: shm_example_simple_lap_z_facto2_sched0_kwayprojections_rqrcpbegin Test #204: shm_example_simple_lap_z_facto3_sched1_1d ............................... Passed 193.09 sec Start 1028: shm_example_simple_lap_c_facto2_sched1_kway_rqrcpend Test #437: shm_example_simple_lap_d_facto2_sched0_kway_pqrcpilu1 ................... 
Passed 192.16 sec Start 1029: shm_example_simple_lap_c_facto2_sched1_kwayprojections_rqrcpbegin 518/3626 Test #679: shm_example_simple_lap_z_facto2_sched0_kwayprojections_rqrcpend .........***Timeout 203.35 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 679: shm_example_simple_lap_z_facto2_sched0_kwayprojections_rqrcpend 518/3626 Test #677: shm_example_simple_lap_z_facto2_sched0_kway_rqrcpend ....................***Timeout 203.84 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size 
(min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 677: shm_example_simple_lap_z_facto2_sched0_kway_rqrcpend 518/3626 Test #712: shm_example_simple_lap_z_facto3_sched0_not_tqrcpbegin ...................***Timeout 200.64 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 712: shm_example_simple_lap_z_facto3_sched0_not_tqrcpbegin 518/3626 Test #734: shm_example_simple_lap_z_facto4_sched0_kway_pqrcpbegin .................. 
Passed 192.48 sec 519/3626 Test #756: shm_example_simple_lap_z_facto4_sched0_kway_pqrcpilu0 ................... Passed 192.28 sec Start 1030: shm_example_simple_lap_c_facto2_sched1_kwayprojections_rqrcpend Start 1031: shm_example_simple_lap_c_facto2_sched1_not_tqrcpbegin 520/3626 Test #749: shm_example_simple_lap_z_facto4_sched0_kwayprojections_tqrcpend ......... Passed 192.43 sec Test #158: c_shm_example_simple_mixed_lap_d_facto1 ................................. Passed 194.43 sec Start 1032: shm_example_simple_lap_c_facto2_sched1_not_tqrcpend Start 1033: shm_example_simple_lap_c_facto2_sched1_kway_tqrcpbegin Test #280: shm_example_simple_lap_s_facto1_sched0_kway_svdbegin .................... Passed 194.00 sec Start 1034: shm_example_simple_lap_c_facto2_sched1_kway_tqrcpend 523/3626 Test #765: shm_example_simple_lap_s_facto0_sched1_not_pqrcpend ..................... Passed 192.60 sec Start 1035: shm_example_simple_lap_c_facto2_sched1_kwayprojections_tqrcpbegin 524/3626 Test #803: shm_example_simple_lap_s_facto1_sched1_not_rqrcpend ..................... Passed 192.42 sec Start 1036: shm_example_simple_lap_c_facto2_sched1_kwayprojections_tqrcpend 525/3626 Test #755: shm_example_simple_lap_z_facto4_sched0_kwayprojections_rqrrtend ......... 
Passed 193.22 sec Start 1037: shm_example_simple_lap_c_facto2_sched1_not_rqrrtbegin 526/3626 Test #721: shm_example_simple_lap_z_facto3_sched0_kway_rqrrtend ....................***Timeout 200.59 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 721: shm_example_simple_lap_z_facto3_sched0_kway_rqrrtend Test #209: shm_example_simple_lap_d_facto0_sched4_1d ............................... 
Passed 195.57 sec Start 1038: shm_example_simple_lap_c_facto2_sched1_not_rqrrtend Test #32: c_shm_example_simple_lap_z_facto4 .......................................***Timeout 206.81 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.223271e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.328411e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.101537e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 4.593437e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.904817e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 7.084973e-02 s Time to 
initialize coeftab 1.262023e-01 s Time to factorize 2.317252e+00 s ( 9.20 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Memory usage of coeftab 1.25 Mo Time to solve 1.690011e+00 s Time for refinement 9.089219e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.805848e-16 max(|| b_i - A x_i ||_1) 1.874448e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.729867e-03 (SUCCESS) max(|| x_i ||_oo) 6.822263e-01 max(|| x0_i - x_i ||_oo) 1.535363e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.250518e-03 (SUCCESS) Start 32: c_shm_example_simple_lap_z_facto4 Test #56: c_shm_example_simple_trans_lap_c_facto1 .................................***Timeout 206.49 sec Start 56: c_shm_example_simple_trans_lap_c_facto1 Test #57: c_shm_example_simple_trans_lap_c_facto2 .................................***Timeout 206.48 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.421108e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 
Time to compute symbol matrix 2.427500e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.527740e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 3.940339e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.830926e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.838764e-02 s Time to initialize coeftab 1.289752e-01 s Time to factorize 3.300596e+00 s (12.11 MFlop/s) Number of operations 8.15 MFlops Number of static pivots 0 Memory usage of coeftab 1.25 Mo Time to solve 1.928583e+00 s Time for refinement 9.970140e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.052328e-07 max(|| b_i - A x_i ||_1) 8.864194e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.236701e+00 (SUCCESS) max(|| x_i ||_oo) 6.822263e-01 max(|| x0_i - x_i ||_oo) 3.919884e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 5.745723e-01 (SUCCESS) Start 57: c_shm_example_simple_trans_lap_c_facto2 Test #62: c_shm_example_simple_trans_lap_z_facto2 .................................***Timeout 206.38 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple 
Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 62: c_shm_example_simple_trans_lap_z_facto2 Test #155: c_shm_example_simple_mixed_refine_gmres .................................***Timeout 204.68 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: General Arithmetic: Double Format: CSC N: 1030 nnz: 6858 Start 155: c_shm_example_simple_mixed_refine_gmres Test #156: c_shm_example_simple_mixed_refine_bicgstab ..............................***Timeout 204.82 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression 
Matrix type: General Arithmetic: Double Format: CSC N: 1030 nnz: 6858 Start 156: c_shm_example_simple_mixed_refine_bicgstab Test #157: c_shm_example_simple_mixed_lap_d_facto0 .................................***Timeout 204.91 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.385669e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.482543e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.985737e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 1.864402e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.391432e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to 
initialize internal csc 3.322602e-03 s Time to initialize coeftab 8.062083e-02 s Time to factorize 1.076805e+00 s ( 4.70 MFlop/s) Number of operations 6.47 MFlops Number of static pivots 0 Memory usage of coeftab 319 Ko Time to solve 1.794701e+00 s - iteration 1 : total iteration time 1.41 s error 5.6458e-14 Time for refinement 2.590139e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.644951e-14 max(|| b_i - A x_i ||_1) 1.696053e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.131236e-01 (SUCCESS) max(|| x_i ||_oo) 4.996108e-01 max(|| x0_i - x_i ||_oo) 1.650347e-13 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 3.303264e-01 (SUCCESS) Start 157: c_shm_example_simple_mixed_lap_d_facto0 Test #160: c_shm_example_simple_mixed_lap_d_refine_cg_sym ..........................***Timeout 205.06 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Start 160: c_shm_example_simple_mixed_lap_d_refine_cg_sym Test #162: c_shm_example_simple_mixed_lap_d_refine_bicgstab_sym ....................***Timeout 205.06 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Start 162: c_shm_example_simple_mixed_lap_d_refine_bicgstab_sym Test #163: c_shm_example_simple_mixed_lap_z_facto0 .................................***Timeout 205.07 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 163: c_shm_example_simple_mixed_lap_z_facto0 Test #165: c_shm_example_simple_mixed_lap_z_facto2 .................................***Timeout 205.06 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled 
Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 165: c_shm_example_simple_mixed_lap_z_facto2 Test #166: c_shm_example_simple_mixed_lap_z_facto3 .................................***Timeout 205.46 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 166: c_shm_example_simple_mixed_lap_z_facto3 Test #169: c_shm_example_simple_mixed_lap_z_refine_gmres_her .......................***Timeout 205.70 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 
Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Complex64 Format: CSC N: 1000 nnz: 11476 Start 169: c_shm_example_simple_mixed_lap_z_refine_gmres_her Test #172: c_shm_example_simple_mixed_lap_z_refine_gmres_sym .......................***Timeout 205.69 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 172: c_shm_example_simple_mixed_lap_z_refine_gmres_sym Test #197: shm_example_simple_lap_c_facto1_sched1_1d ...............................***Timeout 205.63 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: 
Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.507070e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.125875e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.519924e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.576721e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.937982e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.468047e-03 s Time to initialize coeftab 6.121319e-02 s Time to factorize 2.745836e+00 s ( 7.76 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Memory usage of coeftab 638 Ko Time to solve 6.472246e-01 s Time for refinement 5.807203e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.048581e-07 max(|| b_i - A x_i ||_1) 8.778942e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.215190e+00 (SUCCESS) Start 197: shm_example_simple_lap_c_facto1_sched1_1d Test #200: shm_example_simple_lap_c_facto4_sched1_1d ...............................***Timeout 205.62 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: 
sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 200: shm_example_simple_lap_c_facto4_sched1_1d Test #201: shm_example_simple_lap_z_facto0_sched1_1d ...............................***Timeout 205.73 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 201: shm_example_simple_lap_z_facto0_sched1_1d Test #203: shm_example_simple_lap_z_facto2_sched1_1d ...............................***Timeout 205.75 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI 
communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 203: shm_example_simple_lap_z_facto2_sched1_1d Test #205: shm_example_simple_lap_z_facto4_sched1_1d ...............................***Timeout 205.76 sec Start 205: shm_example_simple_lap_z_facto4_sched1_1d Test #214: shm_example_simple_lap_c_facto2_sched4_1d ...............................***Timeout 205.72 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 214: shm_example_simple_lap_c_facto2_sched4_1d Test #215: shm_example_simple_lap_c_facto3_sched4_1d ...............................***Timeout 205.75 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: 
PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 215: shm_example_simple_lap_c_facto3_sched4_1d Test #216: shm_example_simple_lap_c_facto4_sched4_1d ...............................***Timeout 205.78 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 216: shm_example_simple_lap_c_facto4_sched4_1d Test #218: shm_example_simple_lap_z_facto1_sched4_1d ...............................***Timeout 205.95 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank 
parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 218: shm_example_simple_lap_z_facto1_sched4_1d Test #219: shm_example_simple_lap_z_facto2_sched4_1d ...............................***Timeout 206.09 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.685776e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.348798e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.508420e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 2.592560e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.058838e+00 s +-------------------------------------------------+ Factorization task: 
Factorization used: LU Time to initialize internal csc 6.400203e-03 s Time to initialize coeftab 1.092533e-01 s Time to factorize 2.830474e+00 s (14.12 MFlop/s) Number of operations 8.15 MFlops Number of static pivots 0 Memory usage of coeftab 2.49 Mo Time to solve 1.210593e+00 s Time for refinement 1.416630e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.673180e-16 max(|| b_i - A x_i ||_1) 1.774783e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.478379e-03 (SUCCESS) Start 219: shm_example_simple_lap_z_facto2_sched4_1d Test #252: shm_example_simple_lap_s_facto0_sched0_not_pqrcpbegin ...................***Timeout 206.36 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 252: shm_example_simple_lap_s_facto0_sched0_not_pqrcpbegin Test #258: shm_example_simple_lap_s_facto0_sched0_not_rqrcpbegin ...................***Timeout 206.46 sec ischedInit: The thread number 
has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.581328e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.243946e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.054928e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 4.234221e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.054562e+00 s 
+-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 9.330119e-03 s Time to initialize coeftab 5.511354e-01 s Time to factorize 4.007384e-01 s (12.63 MFlop/s) Number of operations 6.47 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 1.76 Ko Outside 2.11 Ko Low-rank supernodes Diag in diag 124 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 191 Ko / 191 Ko ------------------------------------------------ Total 319 Ko / 319 Ko Time to solve 5.454516e-03 s - iteration 1 : total iteration time 0.00407 s error 3.3785e-11 Time for refinement 1.064519e-02 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.767364e-08 max(|| b_i - A x_i ||_1) 2.783079e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.497119e-01 (SUCCESS) Start 258: shm_example_simple_lap_s_facto0_sched0_not_rqrcpbegin Test #282: shm_example_simple_lap_s_facto1_sched0_kwayprojections_svdbegin .........***Timeout 206.59 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization 
method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 282: shm_example_simple_lap_s_facto1_sched0_kwayprojections_svdbegin Test #286: shm_example_simple_lap_s_facto1_sched0_kway_pqrcpbegin ..................***Timeout 206.71 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 286: shm_example_simple_lap_s_facto1_sched0_kway_pqrcpbegin Test #291: shm_example_simple_lap_s_facto1_sched0_not_rqrcpend .....................***Timeout 206.70 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled 
Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.627674e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.651027e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.025472e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.431334e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.812014e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 6.250127e-03 s Time to initialize coeftab 1.988863e-01 s Time to factorize 1.236736e-01 s (42.32 MFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Compression: ------------------------------------------------ 
Full-rank supernodes Inside 1.76 Ko Outside 2.11 Ko Low-rank supernodes Diag in diag 124 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 191 Ko / 191 Ko ------------------------------------------------ Total 319 Ko / 319 Ko Time to solve 5.458625e-03 s Time for refinement 3.224767e-03 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.900394e-07 max(|| b_i - A x_i ||_1) 8.189143e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.029019e+00 (SUCCESS) Start 291: shm_example_simple_lap_s_facto1_sched0_not_rqrcpend Test #294: shm_example_simple_lap_s_facto1_sched0_kwayprojections_rqrcpbegin .......***Timeout 206.81 sec Start 294: shm_example_simple_lap_s_facto1_sched0_kwayprojections_rqrcpbegin Test #326: shm_example_simple_lap_s_facto2_sched0_kwayprojections_rqrcpbegin .......***Timeout 206.90 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 
Start 326: shm_example_simple_lap_s_facto2_sched0_kwayprojections_rqrcpbegin Test #330: shm_example_simple_lap_s_facto2_sched0_kway_tqrcpbegin ..................***Timeout 207.17 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 330: shm_example_simple_lap_s_facto2_sched0_kway_tqrcpbegin Test #332: shm_example_simple_lap_s_facto2_sched0_kwayprojections_tqrcpbegin .......***Timeout 207.32 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 
- Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 332: shm_example_simple_lap_s_facto2_sched0_kwayprojections_tqrcpbegin Test #339: shm_example_simple_lap_s_facto2_sched0_kwayprojections_rqrrtend .........***Timeout 207.67 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 339: shm_example_simple_lap_s_facto2_sched0_kwayprojections_rqrrtend Test #362: shm_example_simple_lap_d_facto0_sched0_kway_tqrcpbegin ..................***Timeout 207.66 sec ischedInit: The thread number has been automatically 
set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Start 362: shm_example_simple_lap_d_facto0_sched0_kway_tqrcpbegin Test #363: shm_example_simple_lap_d_facto0_sched0_kway_tqrcpend ....................***Timeout 207.68 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per 
block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.868395e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.346777e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.539589e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 1.048008e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.027422e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.413503e-03 s Time to initialize coeftab 2.261785e-01 s Time to factorize 4.151734e-01 s (12.19 MFlop/s) Number of operations 6.47 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 1.142685e-02 s Time for refinement 6.230996e-03 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.649207e-16 max(|| b_i - A x_i ||_1) 
1.929752e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.424899e-03 (SUCCESS) Start 363: shm_example_simple_lap_d_facto0_sched0_kway_tqrcpend Test #364: shm_example_simple_lap_d_facto0_sched0_kwayprojections_tqrcpbegin .......***Timeout 208.06 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Start 364: shm_example_simple_lap_d_facto0_sched0_kwayprojections_tqrcpbegin Test #372: shm_example_simple_lap_d_facto0_sched0_kway_pqrcpilu0 ...................***Timeout 208.30 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple 
Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.837195e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.865102e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.845240e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 2.752148e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.131441e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 5.717266e-03 s Time to initialize coeftab 2.055381e-01 s Time to factorize 1.046114e-01 s (48.39 MFlop/s) Number of operations 6.47 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside 
selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 1.292844e-02 s - iteration 1 : total iteration time 0.00757 s error 5.9009e-15 Time for refinement 1.799284e-02 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.896858e-15 max(|| b_i - A x_i ||_1) 5.917185e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.435452e-03 (SUCCESS) Start 372: shm_example_simple_lap_d_facto0_sched0_kway_pqrcpilu0 Test #375: shm_example_simple_lap_d_facto1_sched0_not_svdend .......................***Timeout 208.47 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Start 375: shm_example_simple_lap_d_facto1_sched0_not_svdend Test #376: shm_example_simple_lap_d_facto1_sched0_kway_svdbegin ....................***Timeout 209.12 sec ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Start 376: shm_example_simple_lap_d_facto1_sched0_kway_svdbegin Test #382: shm_example_simple_lap_d_facto1_sched0_kway_pqrcpbegin ..................***Timeout 209.21 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute 
Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Start 382: shm_example_simple_lap_d_facto1_sched0_kway_pqrcpbegin Test #388: shm_example_simple_lap_d_facto1_sched0_kway_rqrcpbegin ..................***Timeout 209.20 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.436455e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.801617e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.464974e-02 s 
+-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.658969e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.758739e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 7.098028e-02 s Time to initialize coeftab 3.076188e-01 s Time to factorize 4.266197e+00 s ( 1.23 MFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 5.807720e-03 s - iteration 1 : total iteration time 0.00383 s error 5.4302e-14 Time for refinement 7.980445e-03 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.430448e-14 max(|| b_i - A x_i ||_1) 5.955393e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.483463e-02 (SUCCESS) Start 388: shm_example_simple_lap_d_facto1_sched0_kway_rqrcpbegin Test #393: shm_example_simple_lap_d_facto1_sched0_not_tqrcpend .....................***Timeout 209.20 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication 
support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Start 393: shm_example_simple_lap_d_facto1_sched0_not_tqrcpend Test #409: shm_example_simple_lap_d_facto2_sched0_kway_svdend ......................***Timeout 209.17 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Start 409: shm_example_simple_lap_d_facto2_sched0_kway_svdend Test #416: 
shm_example_simple_lap_d_facto2_sched0_kwayprojections_pqrcpbegin .......***Timeout 209.13 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Start 416: shm_example_simple_lap_d_facto2_sched0_kwayprojections_pqrcpbegin Test #419: shm_example_simple_lap_d_facto2_sched0_not_rqrcpend .....................***Timeout 209.12 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank 
parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Start 419: shm_example_simple_lap_d_facto2_sched0_not_rqrcpend Test #420: shm_example_simple_lap_d_facto2_sched0_kway_rqrcpbegin ..................***Timeout 209.33 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.319782e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time 
to compute symbol matrix 6.486986e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.582350e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 9.336264e-02 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.431359e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.304941e-03 s Time to initialize coeftab 7.943625e-01 s Time to factorize 1.064844e+00 s ( 9.38 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 5.239587e-03 s - iteration 1 : total iteration time 0.00388 s error 5.5242e-14 Time for refinement 1.093480e-02 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.524210e-14 max(|| b_i - A x_i ||_1) 6.118383e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.688275e-02 (SUCCESS) Start 420: shm_example_simple_lap_d_facto2_sched0_kway_rqrcpbegin Test #426: shm_example_simple_lap_d_facto2_sched0_kway_tqrcpbegin ..................***Timeout 209.51 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread 
static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Start 426: shm_example_simple_lap_d_facto2_sched0_kway_tqrcpbegin Test #436: shm_example_simple_lap_d_facto2_sched0_kway_pqrcpilu0 ...................***Timeout 209.60 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric 
Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Start 436: shm_example_simple_lap_d_facto2_sched0_kway_pqrcpilu0 Test #445: shm_example_simple_lap_c_facto0_sched0_not_pqrcpend .....................***Timeout 209.84 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 445: shm_example_simple_lap_c_facto0_sched0_not_pqrcpend Test #450: shm_example_simple_lap_c_facto0_sched0_not_rqrcpbegin ...................***Timeout 210.00 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 
Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 450: shm_example_simple_lap_c_facto0_sched0_not_rqrcpbegin Test #453: shm_example_simple_lap_c_facto0_sched0_kway_rqrcpend ....................***Timeout 210.17 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.630026e+00 s +-------------------------------------------------+ Symbolic factorization subtask: 
Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.523268e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.832201e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 7.393908e-02 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.722278e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.329751e-03 s Time to initialize coeftab 4.500789e-02 s Time to factorize 7.193657e-01 s (28.19 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 2.945242e-02 s Time for refinement 7.235536e-03 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.087415e-07 max(|| b_i - A x_i ||_1) 9.200869e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.321654e+00 (SUCCESS) Start 453: shm_example_simple_lap_c_facto0_sched0_kway_rqrcpend Test #460: shm_example_simple_lap_c_facto0_sched0_kwayprojections_tqrcpbegin .......***Timeout 210.17 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ 
Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.795720e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.703464e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.388213e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 1.282838e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.016884e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.379133e-03 s Time to initialize coeftab 1.000444e+00 s Time to factorize 
1.843555e+00 s (11.00 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 3.067220e-02 s - iteration 1 : total iteration time 0.0307 s error 5.208e-11 Time for refinement 5.888007e-02 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822262e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.500726e-08 max(|| b_i - A x_i ||_1) 3.242960e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.182957e-01 (SUCCESS) Start 460: shm_example_simple_lap_c_facto0_sched0_kwayprojections_tqrcpbegin Test #463: shm_example_simple_lap_c_facto0_sched0_not_rqrrtend .....................***Timeout 210.19 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: 
Complex32 Format: CSC N: 1000 nnz: 3700 Start 463: shm_example_simple_lap_c_facto0_sched0_not_rqrrtend Test #473: shm_example_simple_lap_c_facto1_sched0_kway_svdend ......................***Timeout 210.14 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 473: shm_example_simple_lap_c_facto1_sched0_kway_svdend Test #474: shm_example_simple_lap_c_facto1_sched0_kwayprojections_svdbegin .........***Timeout 210.39 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models 
CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 474: shm_example_simple_lap_c_facto1_sched0_kwayprojections_svdbegin Test #476: shm_example_simple_lap_c_facto1_sched0_not_pqrcpbegin ...................***Timeout 210.51 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 476: shm_example_simple_lap_c_facto1_sched0_not_pqrcpbegin Test #478: shm_example_simple_lap_c_facto1_sched0_kway_pqrcpbegin ..................***Timeout 210.68 sec ischedInit: The thread number has been 
automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.468022e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.531717e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.278398e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.288108e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.634184e+00 s +-------------------------------------------------+ 
Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.953007e-03 s Time to initialize coeftab 5.484213e-01 s Time to factorize 9.210633e-01 s (23.13 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 3.022553e-02 s - iteration 1 : total iteration time 0.0284 s error 2.0326e-11 Time for refinement 4.969804e-02 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822262e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.308260e-08 max(|| b_i - A x_i ||_1) 3.117375e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.866070e-01 (SUCCESS) Start 478: shm_example_simple_lap_c_facto1_sched0_kway_pqrcpbegin Test #484: shm_example_simple_lap_c_facto1_sched0_kway_rqrcpbegin ..................***Timeout 210.67 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of 
projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 484: shm_example_simple_lap_c_facto1_sched0_kway_rqrcpbegin Test #488: shm_example_simple_lap_c_facto1_sched0_not_tqrcpbegin ...................***Timeout 210.70 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 488: shm_example_simple_lap_c_facto1_sched0_not_tqrcpbegin Test #490: shm_example_simple_lap_c_facto1_sched0_kway_tqrcpbegin ..................***Timeout 210.72 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 
Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 490: shm_example_simple_lap_c_facto1_sched0_kway_tqrcpbegin Test #492: shm_example_simple_lap_c_facto1_sched0_kwayprojections_tqrcpbegin .......***Timeout 210.79 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 492: 
shm_example_simple_lap_c_facto1_sched0_kwayprojections_tqrcpbegin Test #493: shm_example_simple_lap_c_facto1_sched0_kwayprojections_tqrcpend .........***Timeout 210.88 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 493: shm_example_simple_lap_c_facto1_sched0_kwayprojections_tqrcpend Test #494: shm_example_simple_lap_c_facto1_sched0_not_rqrrtbegin ...................***Timeout 211.06 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: 
AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.474662e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.244546e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.541328e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.277917e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.621424e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 9.348179e-03 s Time to initialize coeftab 1.555980e+00 s Time to factorize 1.199276e+00 s (17.77 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko 
------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 2.970586e-02 s - iteration 1 : total iteration time 0.0278 s error 5.1961e-11 Time for refinement 3.735870e-02 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822262e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.866677e-08 max(|| b_i - A x_i ||_1) 3.344820e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.439981e-01 (SUCCESS) Start 494: shm_example_simple_lap_c_facto1_sched0_not_rqrrtbegin Test #502: shm_example_simple_lap_c_facto2_sched0_not_svdbegin .....................***Timeout 211.11 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 502: shm_example_simple_lap_c_facto2_sched0_not_svdbegin 527/3626 Test #735: shm_example_simple_lap_z_facto4_sched0_kway_pqrcpend ....................***Timeout 211.17 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : 
Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 735: shm_example_simple_lap_z_facto4_sched0_kway_pqrcpend 527/3626 Test #737: shm_example_simple_lap_z_facto4_sched0_kwayprojections_pqrcpend .........***Timeout 211.27 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY 
and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 737: shm_example_simple_lap_z_facto4_sched0_kwayprojections_pqrcpend 527/3626 Test #738: shm_example_simple_lap_z_facto4_sched0_not_rqrcpbegin ...................***Timeout 211.51 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.572602e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.835052e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.330767e-01 s 
+-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 4.601049e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.413646e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 1.236259e-01 s Time to initialize coeftab 1.132258e+00 s Time to factorize 1.467126e+00 s (14.52 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 497 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1.25 Mo / 1.25 Mo Time to solve 1.364853e-02 s - iteration 1 : total iteration time 0.0132 s error 1.7822e-14 Time for refinement 2.215449e-02 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.781995e-14 max(|| b_i - A x_i ||_1) 2.336979e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.896989e-02 (SUCCESS) Start 738: shm_example_simple_lap_z_facto4_sched0_not_rqrcpbegin 527/3626 Test #739: shm_example_simple_lap_z_facto4_sched0_not_rqrcpend .....................***Timeout 211.73 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI 
communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 739: shm_example_simple_lap_z_facto4_sched0_not_rqrcpend 527/3626 Test #740: shm_example_simple_lap_z_facto4_sched0_kway_rqrcpbegin ..................***Timeout 211.91 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 740: shm_example_simple_lap_z_facto4_sched0_kway_rqrcpbegin 527/3626 Test #741: 
shm_example_simple_lap_z_facto4_sched0_kway_rqrcpend ....................***Timeout 211.92 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 741: shm_example_simple_lap_z_facto4_sched0_kway_rqrcpend 527/3626 Test #742: shm_example_simple_lap_z_facto4_sched0_kwayprojections_rqrcpbegin .......***Timeout 211.93 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy 
Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 742: shm_example_simple_lap_z_facto4_sched0_kwayprojections_rqrcpbegin 527/3626 Test #743: shm_example_simple_lap_z_facto4_sched0_kwayprojections_rqrcpend .........***Timeout 212.14 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.102731e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L 
structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.682264e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.483757e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.717572e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.492723e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 1.263780e-02 s Time to initialize coeftab 9.392337e-02 s Time to factorize 3.548914e-01 s (60.04 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 497 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1.25 Mo / 1.25 Mo Time to solve 1.286072e-02 s Time for refinement 4.416334e-03 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.742034e-16 max(|| b_i - A x_i ||_1) 1.862444e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.699577e-03 (SUCCESS) Start 743: shm_example_simple_lap_z_facto4_sched0_kwayprojections_rqrcpend 527/3626 Test #744: shm_example_simple_lap_z_facto4_sched0_not_tqrcpbegin ...................***Timeout 212.33 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled 
thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 744: shm_example_simple_lap_z_facto4_sched0_not_tqrcpbegin 527/3626 Test #745: shm_example_simple_lap_z_facto4_sched0_not_tqrcpend .....................***Timeout 212.51 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix 
type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 745: shm_example_simple_lap_z_facto4_sched0_not_tqrcpend 527/3626 Test #746: shm_example_simple_lap_z_facto4_sched0_kway_tqrcpbegin ..................***Timeout 212.64 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 746: shm_example_simple_lap_z_facto4_sched0_kway_tqrcpbegin 527/3626 Test #747: shm_example_simple_lap_z_facto4_sched0_kway_tqrcpend ....................***Timeout 213.00 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) 
Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 747: shm_example_simple_lap_z_facto4_sched0_kway_tqrcpend 527/3626 Test #748: shm_example_simple_lap_z_facto4_sched0_kwayprojections_tqrcpbegin .......***Timeout 213.16 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 748: shm_example_simple_lap_z_facto4_sched0_kwayprojections_tqrcpbegin 527/3626 Test #750: shm_example_simple_lap_z_facto4_sched0_not_rqrrtbegin 
...................***Timeout 213.67 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 750: shm_example_simple_lap_z_facto4_sched0_not_rqrrtbegin 527/3626 Test #751: shm_example_simple_lap_z_facto4_sched0_not_rqrrtend .....................***Timeout 213.82 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method 
RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 751: shm_example_simple_lap_z_facto4_sched0_not_rqrrtend 527/3626 Test #752: shm_example_simple_lap_z_facto4_sched0_kway_rqrrtbegin ..................***Timeout 214.27 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 752: shm_example_simple_lap_z_facto4_sched0_kway_rqrrtbegin 527/3626 Test #753: shm_example_simple_lap_z_facto4_sched0_kway_rqrrtend ....................***Timeout 214.28 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 753: shm_example_simple_lap_z_facto4_sched0_kway_rqrrtend 527/3626 Test #754: shm_example_simple_lap_z_facto4_sched0_kwayprojections_rqrrtbegin .......***Timeout 214.30 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of 
projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 754: shm_example_simple_lap_z_facto4_sched0_kwayprojections_rqrrtbegin 527/3626 Test #757: shm_example_simple_lap_z_facto4_sched0_kway_pqrcpilu1 ...................***Timeout 214.29 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 757: shm_example_simple_lap_z_facto4_sched0_kway_pqrcpilu1 527/3626 Test #758: shm_example_simple_lap_s_facto0_sched1_not_svdbegin .....................***Timeout 214.30 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of 
threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 758: shm_example_simple_lap_s_facto0_sched1_not_svdbegin 527/3626 Test #759: shm_example_simple_lap_s_facto0_sched1_not_svdend .......................***Timeout 214.30 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 759: shm_example_simple_lap_s_facto0_sched1_not_svdend 
527/3626 Test #760: shm_example_simple_lap_s_facto0_sched1_kway_svdbegin ....................***Timeout 214.30 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 760: shm_example_simple_lap_s_facto0_sched1_kway_svdbegin 527/3626 Test #761: shm_example_simple_lap_s_facto0_sched1_kway_svdend ......................***Timeout 214.30 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank 
parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 761: shm_example_simple_lap_s_facto0_sched1_kway_svdend 527/3626 Test #763: shm_example_simple_lap_s_facto0_sched1_kwayprojections_svdend ...........***Timeout 214.29 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 763: shm_example_simple_lap_s_facto0_sched1_kwayprojections_svdend 527/3626 Test #764: shm_example_simple_lap_s_facto0_sched1_not_pqrcpbegin ...................***Timeout 214.29 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : 
Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 764: shm_example_simple_lap_s_facto0_sched1_not_pqrcpbegin 527/3626 Test #766: shm_example_simple_lap_s_facto0_sched1_kway_pqrcpbegin ..................***Timeout 214.56 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy 
KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 766: shm_example_simple_lap_s_facto0_sched1_kway_pqrcpbegin 527/3626 Test #767: shm_example_simple_lap_s_facto0_sched1_kway_pqrcpend ....................***Timeout 214.88 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 767: shm_example_simple_lap_s_facto0_sched1_kway_pqrcpend 527/3626 Test #768: shm_example_simple_lap_s_facto0_sched1_kwayprojections_pqrcpbegin .......***Timeout 215.04 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads 
per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 768: shm_example_simple_lap_s_facto0_sched1_kwayprojections_pqrcpbegin 527/3626 Test #769: shm_example_simple_lap_s_facto0_sched1_kwayprojections_pqrcpend .........***Timeout 215.12 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 769: 
shm_example_simple_lap_s_facto0_sched1_kwayprojections_pqrcpend 527/3626 Test #773: shm_example_simple_lap_s_facto0_sched1_kway_rqrcpend ....................***Timeout 215.10 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 773: shm_example_simple_lap_s_facto0_sched1_kway_rqrcpend 527/3626 Test #775: shm_example_simple_lap_s_facto0_sched1_kwayprojections_rqrcpend .........***Timeout 215.10 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 
6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 775: shm_example_simple_lap_s_facto0_sched1_kwayprojections_rqrcpend 527/3626 Test #776: shm_example_simple_lap_s_facto0_sched1_not_tqrcpbegin ...................***Timeout 215.13 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 776: shm_example_simple_lap_s_facto0_sched1_not_tqrcpbegin 527/3626 Test #783: shm_example_simple_lap_s_facto0_sched1_not_rqrrtend .....................***Timeout 215.09 sec ischedInit: The thread number has been automatically 
set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.047810e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.776128e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.625705e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 2.823277e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.441912e+00 s +-------------------------------------------------+ Factorization task: 
Factorization used: LL^h Time to initialize internal csc 1.759502e-02 s Time to initialize coeftab 4.894043e-02 s Time to factorize 1.126860e+00 s ( 4.49 MFlop/s) Number of operations 6.47 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 1.76 Ko Outside 2.11 Ko Low-rank supernodes Diag in diag 124 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 191 Ko / 191 Ko ------------------------------------------------ Total 319 Ko / 319 Ko Time to solve 5.425119e-01 s Time for refinement 1.385448e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.989808e-07 max(|| b_i - A x_i ||_1) 8.807955e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.106777e+00 (SUCCESS) Start 783: shm_example_simple_lap_s_facto0_sched1_not_rqrrtend 527/3626 Test #785: shm_example_simple_lap_s_facto0_sched1_kway_rqrrtend ....................***Timeout 215.09 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 
Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 785: shm_example_simple_lap_s_facto0_sched1_kway_rqrrtend 527/3626 Test #801: shm_example_simple_lap_s_facto1_sched1_kwayprojections_pqrcpend .........***Timeout 214.95 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 801: shm_example_simple_lap_s_facto1_sched1_kwayprojections_pqrcpend 527/3626 Test #802: shm_example_simple_lap_s_facto1_sched1_not_rqrcpbegin ...................***Timeout 214.99 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: 
PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 802: shm_example_simple_lap_s_facto1_sched1_not_rqrcpbegin Test #387: shm_example_simple_lap_d_facto1_sched0_not_rqrcpend ..................... Passed 213.77 sec 528/3626 Test #762: shm_example_simple_lap_s_facto0_sched1_kwayprojections_svdbegin .........***Timeout 215.52 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 762: 
shm_example_simple_lap_s_facto0_sched1_kwayprojections_svdbegin 528/3626 Test #770: shm_example_simple_lap_s_facto0_sched1_not_rqrcpbegin ...................***Timeout 215.54 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 770: shm_example_simple_lap_s_facto0_sched1_not_rqrcpbegin 528/3626 Test #771: shm_example_simple_lap_s_facto0_sched1_not_rqrcpend .....................***Timeout 215.62 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD 
Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 771: shm_example_simple_lap_s_facto0_sched1_not_rqrcpend 528/3626 Test #772: shm_example_simple_lap_s_facto0_sched1_kway_rqrcpbegin ..................***Timeout 215.99 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 772: shm_example_simple_lap_s_facto0_sched1_kway_rqrcpbegin 528/3626 Test #774: shm_example_simple_lap_s_facto0_sched1_kwayprojections_rqrcpbegin .......***Timeout 216.07 sec ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 774: shm_example_simple_lap_s_facto0_sched1_kwayprojections_rqrcpbegin 528/3626 Test #777: shm_example_simple_lap_s_facto0_sched1_not_tqrcpend .....................***Timeout 216.09 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 
Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 777: shm_example_simple_lap_s_facto0_sched1_not_tqrcpend 528/3626 Test #778: shm_example_simple_lap_s_facto0_sched1_kway_tqrcpbegin ..................***Timeout 216.10 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 778: shm_example_simple_lap_s_facto0_sched1_kway_tqrcpbegin 528/3626 Test #779: shm_example_simple_lap_s_facto0_sched1_kway_tqrcpend ....................***Timeout 216.11 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread 
dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 779: shm_example_simple_lap_s_facto0_sched1_kway_tqrcpend 528/3626 Test #780: shm_example_simple_lap_s_facto0_sched1_kwayprojections_tqrcpbegin .......***Timeout 216.12 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: 
Float Format: CSC N: 1000 nnz: 3700 Start 780: shm_example_simple_lap_s_facto0_sched1_kwayprojections_tqrcpbegin 528/3626 Test #781: shm_example_simple_lap_s_facto0_sched1_kwayprojections_tqrcpend .........***Timeout 216.14 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 781: shm_example_simple_lap_s_facto0_sched1_kwayprojections_tqrcpend 528/3626 Test #782: shm_example_simple_lap_s_facto0_sched1_not_rqrrtbegin ...................***Timeout 216.17 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) 
Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 782: shm_example_simple_lap_s_facto0_sched1_not_rqrrtbegin 528/3626 Test #784: shm_example_simple_lap_s_facto0_sched1_kway_rqrrtbegin ..................***Timeout 216.20 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 784: shm_example_simple_lap_s_facto0_sched1_kway_rqrrtbegin 528/3626 Test #786: shm_example_simple_lap_s_facto0_sched1_kwayprojections_rqrrtbegin .......***Timeout 216.20 sec 
ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 786: shm_example_simple_lap_s_facto0_sched1_kwayprojections_rqrrtbegin 528/3626 Test #787: shm_example_simple_lap_s_facto0_sched1_kwayprojections_rqrrtend .........***Timeout 216.31 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal 
width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 787: shm_example_simple_lap_s_facto0_sched1_kwayprojections_rqrrtend 528/3626 Test #788: shm_example_simple_lap_s_facto0_sched1_kway_pqrcpilu0 ...................***Timeout 216.57 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 788: shm_example_simple_lap_s_facto0_sched1_kway_pqrcpilu0 528/3626 Test #789: shm_example_simple_lap_s_facto0_sched1_kway_pqrcpilu1 ...................***Timeout 216.67 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 789: shm_example_simple_lap_s_facto0_sched1_kway_pqrcpilu1 528/3626 Test #790: shm_example_simple_lap_s_facto1_sched1_not_svdbegin .....................***Timeout 216.89 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 
Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 790: shm_example_simple_lap_s_facto1_sched1_not_svdbegin 528/3626 Test #791: shm_example_simple_lap_s_facto1_sched1_not_svdend .......................***Timeout 216.96 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 791: shm_example_simple_lap_s_facto1_sched1_not_svdend 528/3626 Test #792: shm_example_simple_lap_s_facto1_sched1_kway_svdbegin ....................***Timeout 217.13 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 
0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 792: shm_example_simple_lap_s_facto1_sched1_kway_svdbegin 528/3626 Test #793: shm_example_simple_lap_s_facto1_sched1_kway_svdend ......................***Timeout 217.29 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 793: shm_example_simple_lap_s_facto1_sched1_kway_svdend 528/3626 Test #794: 
shm_example_simple_lap_s_facto1_sched1_kwayprojections_svdbegin .........***Timeout 217.39 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 794: shm_example_simple_lap_s_facto1_sched1_kwayprojections_svdbegin 528/3626 Test #795: shm_example_simple_lap_s_facto1_sched1_kwayprojections_svdend ...........***Timeout 217.40 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank 
parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 795: shm_example_simple_lap_s_facto1_sched1_kwayprojections_svdend 528/3626 Test #796: shm_example_simple_lap_s_facto1_sched1_not_pqrcpbegin ...................***Timeout 217.43 sec Start 796: shm_example_simple_lap_s_facto1_sched1_not_pqrcpbegin 528/3626 Test #797: shm_example_simple_lap_s_facto1_sched1_not_pqrcpend .....................***Timeout 217.62 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 797: shm_example_simple_lap_s_facto1_sched1_not_pqrcpend 528/3626 Test #798: 
shm_example_simple_lap_s_facto1_sched1_kway_pqrcpbegin ..................***Timeout 217.90 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 798: shm_example_simple_lap_s_facto1_sched1_kway_pqrcpbegin 528/3626 Test #799: shm_example_simple_lap_s_facto1_sched1_kway_pqrcpend ....................***Timeout 218.01 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy 
Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 799: shm_example_simple_lap_s_facto1_sched1_kway_pqrcpend 528/3626 Test #800: shm_example_simple_lap_s_facto1_sched1_kwayprojections_pqrcpbegin .......***Timeout 218.03 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 800: shm_example_simple_lap_s_facto1_sched1_kwayprojections_pqrcpbegin Test #17: c_shm_example_simple_lap_s_facto0 .......................................***Timeout 219.02 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse 
matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.862762e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.738426e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.861908e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 4.213762e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.051021e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.259548e-03 s Time to initialize coeftab 1.834130e-01 s Time to factorize 2.028514e+00 s ( 2.50 MFlop/s) Number of operations 6.47 MFlops Number of static pivots 0 Memory usage of coeftab 319 Ko Time to solve 9.315380e-01 s Time for refinement 6.444546e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A 
x_i ||_2 / || b_i ||_2) 2.033099e-07 max(|| b_i - A x_i ||_1) 8.840329e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.110845e+00 (SUCCESS) max(|| x_i ||_oo) 4.996108e-01 max(|| x0_i - x_i ||_oo) 3.874302e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 7.754641e-01 (SUCCESS) Start 17: c_shm_example_simple_lap_s_facto0 Test #20: c_shm_example_simple_lap_d_facto0 .......................................***Timeout 219.35 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.199717e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.488546e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.866648e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 3.178500e-01 s 
+-------------------------------------------------+ Analyze task: Total time for analyze 7.575820e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.368722e-03 s Time to initialize coeftab 6.442261e-02 s Time to factorize 1.161038e+00 s ( 4.36 MFlop/s) Number of operations 6.47 MFlops Number of static pivots 0 Memory usage of coeftab 638 Ko Time to solve 2.089699e+00 s Time for refinement 4.031479e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.619748e-16 max(|| b_i - A x_i ||_1) 1.918394e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.410627e-03 (SUCCESS) max(|| x_i ||_oo) 4.996108e-01 max(|| x0_i - x_i ||_oo) 1.332268e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.666611e-03 (SUCCESS) Start 20: c_shm_example_simple_lap_d_facto0 Test #30: c_shm_example_simple_lap_z_facto2 .......................................***Timeout 219.41 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.345410e+00 s +-------------------------------------------------+ Symbolic factorization subtask: 
Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.666697e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.450424e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 2.252664e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.692417e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.068700e-02 s Time to initialize coeftab 9.567599e-02 s Time to factorize 3.804311e+00 s (10.51 MFlop/s) Number of operations 8.15 MFlops Number of static pivots 0 Memory usage of coeftab 2.49 Mo Time to solve 2.084234e+00 s Time for refinement 5.675094e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.698768e-16 max(|| b_i - A x_i ||_1) 1.764818e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.453233e-03 (SUCCESS) max(|| x_i ||_oo) 6.822263e-01 max(|| x0_i - x_i ||_oo) 1.516176e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.222395e-03 (SUCCESS) Start 30: c_shm_example_simple_lap_z_facto2 Test #34: c_shm_example_simple_solve_and_refine_lap_s_facto1 ......................***Timeout 219.57 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads 
per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.233562e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.242036e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.627795e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.130631e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.825362e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 6.129142e-03 s Time to initialize coeftab 2.062256e-01 s Time to factorize 2.098705e+00 s ( 2.49 MFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Memory usage of coeftab 319 Ko Time to solve 1.134540e+00 s Time for refinement 5.122628e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.955306e-07 max(|| b_i - A x_i ||_1) 8.404832e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.056122e+00 (SUCCESS) max(|| x_i ||_oo) 4.996108e-01 max(|| x0_i - x_i ||_oo) 3.576279e-07 max(|| x0_i - 
x_i ||_oo / || x0_i ||_oo) 7.158130e-01 (SUCCESS) Start 34: c_shm_example_simple_solve_and_refine_lap_s_facto1 Test #46: c_shm_example_simple_solve_and_refine_lap_z_facto2 ......................***Timeout 219.65 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 46: c_shm_example_simple_solve_and_refine_lap_z_facto2 Test #47: c_shm_example_simple_solve_and_refine_lap_z_facto3 ......................***Timeout 219.83 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 
7.200188e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.701483e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.413386e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 4.447646e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.839488e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 1.293883e-02 s Time to initialize coeftab 1.891081e-01 s Time to factorize 3.900600e+00 s ( 5.20 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Memory usage of coeftab 1.25 Mo Time to solve 1.770586e+00 s Time for refinement 4.195816e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.963120e-16 max(|| b_i - A x_i ||_1) 2.000152e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.047062e-03 (SUCCESS) max(|| x_i ||_oo) 6.822263e-01 max(|| x0_i - x_i ||_oo) 1.542371e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.260791e-03 (SUCCESS) Start 47: c_shm_example_simple_solve_and_refine_lap_z_facto3 Test #49: c_shm_example_simple_trans_lap_s_facto0 .................................***Timeout 219.90 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled 
thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.600274e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.741216e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.849486e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 5.896535e-02 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.724827e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.063171e-03 s Time to initialize coeftab 1.638688e-01 s Time to factorize 2.006797e+00 s ( 2.52 MFlop/s) Number of operations 6.47 MFlops Number of static pivots 0 Memory usage of coeftab 319 Ko Time to solve 2.318747e+00 s Time for refinement 1.504612e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.009367e-07 max(|| b_i - A x_i ||_1) 8.835131e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * 
||x_i||_oo * eps)) 1.110192e+00 (SUCCESS) max(|| x_i ||_oo) 4.996108e-01 max(|| x0_i - x_i ||_oo) 3.874302e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 7.754641e-01 (SUCCESS) Start 49: c_shm_example_simple_trans_lap_s_facto0 Test #54: c_shm_example_simple_trans_lap_d_facto2 .................................***Timeout 219.91 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Start 54: c_shm_example_simple_trans_lap_d_facto2 Test #65: c_shm_example_step-by-step_lap_s_facto0 .................................***Timeout 220.11 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 65: 
c_shm_example_step-by-step_lap_s_facto0 Test #66: c_shm_example_step-by-step_lap_s_facto1 .................................***Timeout 220.23 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.597882e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.788100e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.112285e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.282484e-01 s Time to initialize internal csc 3.275179e-03 s Time to initialize coeftab 9.838680e-02 s Time to factorize 1.138413e+00 s ( 4.60 MFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Memory usage of coeftab 
319 Ko Time to solve 2.144448e+00 s Time for refinement 3.612852e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.176381e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.258137e-07 max(|| b_i - A x_i ||_1) 8.973517e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.142853e+00 (SUCCESS) max(|| x_i ||_oo) 4.996108e-01 max(|| x0_i - x_i ||_oo) 5.364418e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 1.074148e+00 (SUCCESS) WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Start 66: c_shm_example_step-by-step_lap_s_facto1 Test #67: c_shm_example_step-by-step_lap_s_facto2 .................................***Timeout 220.30 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Start 67: c_shm_example_step-by-step_lap_s_facto2 Test #68: c_shm_example_step-by-step_lap_d_facto0 .................................***Timeout 220.36 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled 
StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Start 68: c_shm_example_step-by-step_lap_d_facto0 Test #69: c_shm_example_step-by-step_lap_d_facto1 .................................***Timeout 220.37 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Start 69: c_shm_example_step-by-step_lap_d_facto1 Test #70: c_shm_example_step-by-step_lap_d_facto2 .................................***Timeout 220.41 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 
1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Start 70: c_shm_example_step-by-step_lap_d_facto2 Test #71: c_shm_example_step-by-step_lap_c_facto0 .................................***Timeout 220.43 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 71: c_shm_example_step-by-step_lap_c_facto0 Test #72: c_shm_example_step-by-step_lap_c_facto1 .................................***Timeout 220.44 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD 
Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Start 72: c_shm_example_step-by-step_lap_c_facto1 Test #73: c_shm_example_step-by-step_lap_c_facto2 .................................***Timeout 220.45 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 73: c_shm_example_step-by-step_lap_c_facto2 Test #74: c_shm_example_step-by-step_lap_c_facto3 .................................***Timeout 220.65 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: 
Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.565876e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.688099e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.037573e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 2.495784e-01 s Time to initialize internal csc 1.565066e-02 s Time to initialize coeftab 1.910146e-01 s Time to factorize 2.900208e+00 s ( 6.99 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Memory usage of coeftab 638 Ko Time to solve 1.917479e+00 s Time for refinement 5.748480e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.764572e-02 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.262384e-07 max(|| b_i - A x_i ||_1) 9.457573e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.404992e+00 (SUCCESS) max(|| x_i ||_oo) 7.029080e-01 max(|| x0_i - x_i ||_oo) 5.960464e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 8.559307e-01 (SUCCESS) Start 74: c_shm_example_step-by-step_lap_c_facto3 Test #75: c_shm_example_step-by-step_lap_c_facto4 .................................***Timeout 220.91 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Start 75: c_shm_example_step-by-step_lap_c_facto4 Test #76: c_shm_example_step-by-step_lap_z_facto0 .................................***Timeout 220.98 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Start 76: c_shm_example_step-by-step_lap_z_facto0 Test #77: c_shm_example_step-by-step_lap_z_facto1 .................................***Timeout 220.99 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 77: c_shm_example_step-by-step_lap_z_facto1 Test #78: c_shm_example_step-by-step_lap_z_facto2 .................................***Timeout 221.13 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 78: c_shm_example_step-by-step_lap_z_facto2 Test #79: c_shm_example_step-by-step_lap_z_facto3 .................................***Timeout 221.19 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI 
processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.565957e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.274207e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.555008e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 5.222797e-01 s Time to initialize internal csc 2.568282e-01 s Time to initialize coeftab 1.403295e-01 s Time to factorize 2.567798e+00 s ( 7.90 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Memory usage of coeftab 1.25 Mo Time to solve 2.228919e+00 s Time for refinement 3.873509e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.764616e-02 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.054769e-16 max(|| b_i - A x_i ||_1) 2.026171e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.112717e-03 (SUCCESS) max(|| x_i ||_oo) 7.029080e-01 max(|| x0_i - x_i ||_oo) 1.613647e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.295673e-03 (SUCCESS) Start 79: 
c_shm_example_step-by-step_lap_z_facto3 Test #80: c_shm_example_step-by-step_lap_z_facto4 .................................***Timeout 221.32 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.449348e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.987946e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.343894e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.393572e-01 s Time to initialize internal csc 8.986415e-03 s Time to initialize coeftab 2.957557e-01 s Time to factorize 1.271558e+00 s (16.76 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Memory usage of 
coeftab 1.25 Mo Time to solve 2.112500e+00 s Time for refinement 7.057385e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.764616e-02 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.804001e-16 max(|| b_i - A x_i ||_1) 1.852518e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.713856e-03 (SUCCESS) max(|| x_i ||_oo) 7.029080e-01 max(|| x0_i - x_i ||_oo) 1.413083e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.071282e-03 (SUCCESS) WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Start 80: c_shm_example_step-by-step_lap_z_facto4 Test #130: c_shm_example_step-by-step_single_mm ....................................***Timeout 222.04 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Complex64 Format: IJV N: 841 nnz: 2465 WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.005009e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 24350 Fill-in of L 9.878296 Time to compute symbol matrix 6.692234e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping 
criterion -1 Time for reordering 2.170085e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 24350 Fill-in 9.878296 Number of operations in full-rank: LDL^t 3.58 MFlops Prediction: Model AMD 6180 MKL Time to factorize 2.058387e-04 s Time for mapping/scheduling 1.149089e-01 s Time to initialize internal csc 4.244927e-03 s Time to initialize coeftab 1.683571e-01 s Time to factorize 1.013954e+00 s ( 3.53 MFlop/s) Number of operations 4.26 MFlops Number of static pivots 0 Memory usage of coeftab 518 Ko Time to solve 2.193213e+00 s Time for refinement 2.510910e+00 s || A ||_1 8.589990e-02 max(|| b_i ||_oo) 4.291881e-02 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.462448e-15 max(|| b_i - A x_i ||_1) 1.560264e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.744403e-03 (SUCCESS) Start 130: c_shm_example_step-by-step_single_mm Test #131: c_shm_example_step-by-step_single_hb ....................................***Timeout 222.05 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: General Arithmetic: Double Format: CSC N: 1030 nnz: 6858 WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Start 131: c_shm_example_step-by-step_single_hb Test #132: c_shm_example_step-by-step_single_mm2 
...................................***Timeout 222.06 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: IJV N: 1280 nnz: 12029 WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.420350e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 10749 Fill-in of L 0.893590 Time to compute symbol matrix 1.605522e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.221797e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 21498 Fill-in 1.787181 Number of operations in full-rank: LU 1.08 MFlops Prediction: Model AMD 6180 MKL Time to factorize 8.513996e-04 s Time for mapping/scheduling 9.041272e-01 s Time to initialize internal csc 3.771432e-01 s Time to initialize coeftab 1.009880e-01 s Time to factorize 9.515888e-01 s ( 1.14 MFlop/s) Number of operations 1.20 MFlops Number of static pivots 0 Memory usage of coeftab 510 Ko Time to solve 1.591441e+00 s Time for refinement 2.483583e+00 s || A ||_1 
7.256473e-01 max(|| b_i ||_oo) 3.287903e-01 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.081190e-16 max(|| b_i - A x_i ||_1) 1.159113e-18 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.416055e-04 (SUCCESS) Start 132: c_shm_example_step-by-step_single_mm2 Test #136: c_shm_example_refinement_lap_s_refine_cg_sym ............................***Timeout 222.09 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 136: c_shm_example_refinement_lap_s_refine_cg_sym Test #137: c_shm_example_refinement_lap_s_refine_gmres_sym .........................***Timeout 222.10 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric 
Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 137: c_shm_example_refinement_lap_s_refine_gmres_sym Test #138: c_shm_example_refinement_lap_s_refine_bicgstab_sym ......................***Timeout 222.11 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 138: c_shm_example_refinement_lap_s_refine_bicgstab_sym Test #139: c_shm_example_refinement_lap_d_refine_cg_sym ............................***Timeout 222.10 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Start 139: c_shm_example_refinement_lap_d_refine_cg_sym Test #140: c_shm_example_refinement_lap_d_refine_gmres_sym 
.........................***Timeout 222.11 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Start 140: c_shm_example_refinement_lap_d_refine_gmres_sym Test #141: c_shm_example_refinement_lap_d_refine_bicgstab_sym ......................***Timeout 222.11 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Start 141: c_shm_example_refinement_lap_d_refine_bicgstab_sym Test #142: c_shm_example_refinement_lap_c_refine_cg_her ............................***Timeout 222.12 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel 
Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Complex32 Format: CSC N: 1000 nnz: 11476 Start 142: c_shm_example_refinement_lap_c_refine_cg_her Test #143: c_shm_example_refinement_lap_c_refine_gmres_her .........................***Timeout 222.13 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Complex32 Format: CSC N: 1000 nnz: 11476 Start 143: c_shm_example_refinement_lap_c_refine_gmres_her Test #144: c_shm_example_refinement_lap_c_refine_bicgstab_her ......................***Timeout 222.13 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: 
Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Complex32 Format: CSC N: 1000 nnz: 11476 Start 144: c_shm_example_refinement_lap_c_refine_bicgstab_her Test #145: c_shm_example_refinement_lap_c_refine_cg_sym ............................***Timeout 222.13 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 145: c_shm_example_refinement_lap_c_refine_cg_sym Test #146: c_shm_example_refinement_lap_c_refine_gmres_sym .........................***Timeout 222.13 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 
2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 146: c_shm_example_refinement_lap_c_refine_gmres_sym Test #147: c_shm_example_refinement_lap_c_refine_bicgstab_sym ......................***Timeout 222.14 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 147: c_shm_example_refinement_lap_c_refine_bicgstab_sym Test #148: c_shm_example_refinement_lap_z_refine_cg_her ............................***Timeout 222.13 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No 
compression Matrix type: Symmetric Arithmetic: Complex64 Format: CSC N: 1000 nnz: 11476 Start 148: c_shm_example_refinement_lap_z_refine_cg_her Test #150: c_shm_example_refinement_lap_z_refine_bicgstab_her ......................***Timeout 222.14 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Complex64 Format: CSC N: 1000 nnz: 11476 Start 150: c_shm_example_refinement_lap_z_refine_bicgstab_her Test #151: c_shm_example_refinement_lap_z_refine_cg_sym ............................***Timeout 222.18 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 151: c_shm_example_refinement_lap_z_refine_cg_sym Test #153: 
c_shm_example_refinement_lap_z_refine_bicgstab_sym ......................***Timeout 222.27 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 153: c_shm_example_refinement_lap_z_refine_bicgstab_sym Test #159: c_shm_example_simple_mixed_lap_d_facto2 .................................***Timeout 222.28 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Start 159: c_shm_example_simple_mixed_lap_d_facto2 Test #161: c_shm_example_simple_mixed_lap_d_refine_gmres_sym .......................***Timeout 222.28 sec ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Start 161: c_shm_example_simple_mixed_lap_d_refine_gmres_sym Test #164: c_shm_example_simple_mixed_lap_z_facto1 .................................***Timeout 222.28 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.844034e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.938752e-02 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.390226e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 4.565187e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.590591e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.245910e-02 s Time to initialize coeftab 1.570051e-01 s Time to factorize 2.761522e+00 s ( 7.72 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Memory usage of coeftab 638 Ko Time to solve 9.723675e-01 s - iteration 1 : total iteration time 1.33 s error 5.9225e-14 Time for refinement 2.338749e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.922289e-14 max(|| b_i - A x_i ||_1) 1.896631e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.785842e-01 (SUCCESS) max(|| x_i ||_oo) 6.822263e-01 max(|| x0_i - x_i ||_oo) 2.064173e-13 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 3.025642e-01 (SUCCESS) Start 164: c_shm_example_simple_mixed_lap_z_facto1 Test #167: c_shm_example_simple_mixed_lap_z_facto4 .................................***Timeout 222.27 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: 
PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 167: c_shm_example_simple_mixed_lap_z_facto4 Test #168: c_shm_example_simple_mixed_lap_z_refine_cg_her ..........................***Timeout 222.27 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Complex64 Format: CSC N: 1000 nnz: 11476 Start 168: c_shm_example_simple_mixed_lap_z_refine_cg_her Test #170: c_shm_example_simple_mixed_lap_z_refine_bicgstab_her ....................***Timeout 222.27 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank 
parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Complex64 Format: CSC N: 1000 nnz: 11476 Start 170: c_shm_example_simple_mixed_lap_z_refine_bicgstab_her Test #173: c_shm_example_simple_mixed_lap_z_refine_bicgstab_sym ....................***Timeout 222.27 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 173: c_shm_example_simple_mixed_lap_z_refine_bicgstab_sym Test #191: shm_example_simple_lap_s_facto1_sched1_1d ...............................***Timeout 222.27 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 191: shm_example_simple_lap_s_facto1_sched1_1d Test 
#192: shm_example_simple_lap_s_facto2_sched1_1d ...............................***Timeout 222.26 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 192: shm_example_simple_lap_s_facto2_sched1_1d Test #199: shm_example_simple_lap_c_facto3_sched1_1d ...............................***Timeout 222.25 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.702565e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax 
Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.912669e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.762167e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 2.536070e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.445713e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 1.143550e-02 s Time to initialize coeftab 3.180613e-02 s Time to factorize 2.436707e+00 s ( 8.32 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Memory usage of coeftab 638 Ko Time to solve 5.743942e-01 s Time for refinement 1.595309e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.036963e-07 max(|| b_i - A x_i ||_1) 9.070209e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.288685e+00 (SUCCESS) Start 199: shm_example_simple_lap_c_facto3_sched1_1d Test #206: shm_example_simple_lap_s_facto0_sched4_1d ...............................***Timeout 222.25 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 
Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.136909e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.818680e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.420202e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 2.768010e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.207394e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.267587e-03 s Time to initialize coeftab 3.258683e-02 s Time to factorize 1.303439e+00 s ( 3.88 MFlop/s) Number of operations 6.47 MFlops Number of static pivots 0 Memory usage of coeftab 319 Ko Time to solve 8.753961e-01 s Time for refinement 3.380211e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.968910e-07 max(|| b_i - A x_i ||_1) 8.620780e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.083257e+00 (SUCCESS) Start 206: shm_example_simple_lap_s_facto0_sched4_1d Test #207: shm_example_simple_lap_s_facto1_sched4_1d ...............................***Timeout 222.25 sec ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 207: shm_example_simple_lap_s_facto1_sched4_1d Test #212: shm_example_simple_lap_c_facto0_sched4_1d ...............................***Timeout 222.24 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 212: shm_example_simple_lap_c_facto0_sched4_1d Test #217: shm_example_simple_lap_z_facto0_sched4_1d ...............................***Timeout 222.24 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled 
thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.128553e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.965374e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.081526e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 4.761576e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.828591e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.472315e-03 s Time to initialize coeftab 4.404139e-02 s Time to factorize 2.543929e+00 s ( 7.97 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Memory usage of coeftab 1.25 Mo Time to solve 9.263399e-01 s Time for refinement 8.429266e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.013215e-16 max(|| b_i - A x_i ||_1) 2.016739e-16 max(|| b_i - A x_i ||_1 / (||A||_1 
* ||x_i||_oo * eps)) 5.088915e-03 (SUCCESS) Start 217: shm_example_simple_lap_z_facto0_sched4_1d Test #221: shm_example_simple_lap_z_facto4_sched4_1d ...............................***Timeout 222.24 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.540587e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.485935e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.504185e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.421339e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.006809e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 
5.584330e-02 s Time to initialize coeftab 9.749518e-02 s Time to factorize 3.164961e+00 s ( 6.73 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Memory usage of coeftab 1.25 Mo Time to solve 9.373263e-01 s Time for refinement 5.618516e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.753066e-16 max(|| b_i - A x_i ||_1) 1.851421e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.671763e-03 (SUCCESS) Start 221: shm_example_simple_lap_z_facto4_sched4_1d Test #248: shm_example_simple_lap_s_facto0_sched0_kway_svdbegin ....................***Timeout 222.23 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.740011e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct 
Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.893010e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.926049e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 7.571061e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.934854e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.739655e-01 s Time to initialize coeftab 5.115348e-01 s Time to factorize 6.605248e-01 s ( 7.66 MFlop/s) Number of operations 6.47 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 1.76 Ko Outside 2.11 Ko Low-rank supernodes Diag in diag 124 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 191 Ko / 191 Ko ------------------------------------------------ Total 319 Ko / 319 Ko Time to solve 5.521028e-03 s Time for refinement 3.037330e-03 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.205707e-07 max(|| b_i - A x_i ||_1) 9.494804e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.193084e+00 (SUCCESS) Start 248: shm_example_simple_lap_s_facto0_sched0_kway_svdbegin Test #278: shm_example_simple_lap_s_facto1_sched0_not_svdbegin .....................***Timeout 222.23 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled 
thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.591393e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.422175e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.480226e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 9.195666e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.909195e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.009614e-01 s Time to initialize coeftab 6.226783e-01 s Time to factorize 6.448785e-01 s ( 8.12 MFlop/s) Number of operations 6.64 MFlops Number 
of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 1.76 Ko Outside 2.11 Ko Low-rank supernodes Diag in diag 124 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 191 Ko / 191 Ko ------------------------------------------------ Total 319 Ko / 319 Ko Time to solve 5.825710e-03 s Time for refinement 3.007309e-03 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.249276e-07 max(|| b_i - A x_i ||_1) 9.406115e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.181939e+00 (SUCCESS) Start 278: shm_example_simple_lap_s_facto1_sched0_not_svdbegin Test #284: shm_example_simple_lap_s_facto1_sched0_not_pqrcpbegin ...................***Timeout 222.23 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 284: shm_example_simple_lap_s_facto1_sched0_not_pqrcpbegin Test #290: 
shm_example_simple_lap_s_facto1_sched0_not_rqrcpbegin ...................***Timeout 222.23 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.210954e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.346890e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.833020e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.360566e-01 s 
+-------------------------------------------------+ Analyze task: Total time for analyze 8.478852e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 9.371421e-03 s Time to initialize coeftab 6.678785e-01 s Time to factorize 3.514575e+00 s ( 1.49 MFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 1.76 Ko Outside 2.11 Ko Low-rank supernodes Diag in diag 124 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 191 Ko / 191 Ko ------------------------------------------------ Total 319 Ko / 319 Ko Time to solve 5.984296e-03 s - iteration 1 : total iteration time 0.0039 s error 3.3713e-11 Time for refinement 1.027297e-02 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.805603e-08 max(|| b_i - A x_i ||_1) 2.876591e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.614624e-01 (SUCCESS) Start 290: shm_example_simple_lap_s_facto1_sched0_not_rqrcpbegin Test #295: shm_example_simple_lap_s_facto1_sched0_kwayprojections_rqrcpend .........***Timeout 222.23 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress 
minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 295: shm_example_simple_lap_s_facto1_sched0_kwayprojections_rqrcpend Test #298: shm_example_simple_lap_s_facto1_sched0_kway_tqrcpbegin ..................***Timeout 222.23 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.077791e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.706455e-03 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.536049e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.393046e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.435502e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.154795e-03 s Time to initialize coeftab 3.994648e-01 s Time to factorize 4.358150e-01 s (12.01 MFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 1.76 Ko Outside 2.11 Ko Low-rank supernodes Diag in diag 124 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 191 Ko / 191 Ko ------------------------------------------------ Total 319 Ko / 319 Ko Time to solve 5.490177e-03 s - iteration 1 : total iteration time 0.00373 s error 3.3727e-11 Time for refinement 1.024203e-02 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.806737e-08 max(|| b_i - A x_i ||_1) 2.878308e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.616781e-01 (SUCCESS) Start 298: shm_example_simple_lap_s_facto1_sched0_kway_tqrcpbegin Test #308: shm_example_simple_lap_s_facto1_sched0_kway_pqrcpilu0 ...................***Timeout 222.23 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: 
Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.810437e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.017978e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.746638e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.955279e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.003006e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 6.154893e-03 s Time to initialize coeftab 9.805653e-01 s Time to factorize 1.735991e+00 s ( 3.01 MFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Compression: 
------------------------------------------------ Full-rank supernodes Inside 1.76 Ko Outside 2.11 Ko Low-rank supernodes Diag in diag 124 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 191 Ko / 191 Ko ------------------------------------------------ Total 319 Ko / 319 Ko Time to solve 1.530355e-02 s Time for refinement 6.185924e-03 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.287548e-07 max(|| b_i - A x_i ||_1) 9.879076e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.241370e+00 (SUCCESS) Start 308: shm_example_simple_lap_s_facto1_sched0_kway_pqrcpilu0 Test #310: shm_example_simple_lap_s_facto2_sched0_not_svdbegin .....................***Timeout 222.23 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.118542e+00 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.446610e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.783850e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 3.570946e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.638353e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 9.205222e-02 s Time to initialize coeftab 5.336382e-01 s Time to factorize 2.252545e+00 s ( 4.43 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 124 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 514 Ko / 514 Ko Time to solve 5.080131e-03 s Time for refinement 6.852921e-03 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.138846e-07 max(|| b_i - A x_i ||_1) 9.166482e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.151828e+00 (SUCCESS) Start 310: shm_example_simple_lap_s_facto2_sched0_not_svdbegin Test #311: shm_example_simple_lap_s_facto2_sched0_not_svdend .......................***Timeout 222.23 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel 
Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 311: shm_example_simple_lap_s_facto2_sched0_not_svdend Test #315: shm_example_simple_lap_s_facto2_sched0_kwayprojections_svdend ...........***Timeout 222.23 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels 
of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 315: shm_example_simple_lap_s_facto2_sched0_kwayprojections_svdend Test #316: shm_example_simple_lap_s_facto2_sched0_not_pqrcpbegin ...................***Timeout 222.23 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.622561e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.070293e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.442435e-02 s +-------------------------------------------------+ 
Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 3.015794e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.699606e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 7.280526e-01 s Time to initialize coeftab 6.334877e-01 s Time to factorize 5.727970e+00 s ( 1.74 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 124 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 514 Ko / 514 Ko Time to solve 5.692315e-03 s - iteration 1 : total iteration time 0.00398 s error 1.1332e-11 Time for refinement 8.010126e-03 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.645004e-08 max(|| b_i - A x_i ||_1) 2.844144e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.573851e-01 (SUCCESS) Start 316: shm_example_simple_lap_s_facto2_sched0_not_pqrcpbegin Test #322: shm_example_simple_lap_s_facto2_sched0_not_rqrcpbegin ...................***Timeout 222.23 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) 
Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 322: shm_example_simple_lap_s_facto2_sched0_not_rqrcpbegin Test #325: shm_example_simple_lap_s_facto2_sched0_kway_rqrcpend ....................***Timeout 222.23 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.313982e+00 s +-------------------------------------------------+ 
Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.694114e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.158840e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 3.091905e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.658232e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.717428e-03 s Time to initialize coeftab 2.095559e-01 s Time to factorize 2.021673e-01 s (49.39 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 124 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 514 Ko / 514 Ko Time to solve 4.987057e-03 s Time for refinement 3.018519e-03 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.893975e-07 max(|| b_i - A x_i ||_1) 8.091629e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.016766e+00 (SUCCESS) Start 325: shm_example_simple_lap_s_facto2_sched0_kway_rqrcpend Test #333: shm_example_simple_lap_s_facto2_sched0_kwayprojections_tqrcpend .........***Timeout 222.23 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.756494e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.185130e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.778753e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 3.739528e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.061295e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.604165e-01 s Time to initialize 
coeftab 1.846898e-01 s Time to factorize 8.975136e-01 s (11.12 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 124 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 514 Ko / 514 Ko Time to solve 2.404794e-02 s Time for refinement 1.429384e-02 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.893975e-07 max(|| b_i - A x_i ||_1) 8.091629e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.016766e+00 (SUCCESS) Start 333: shm_example_simple_lap_s_facto2_sched0_kwayprojections_tqrcpend Test #335: shm_example_simple_lap_s_facto2_sched0_not_rqrrtend .....................***Timeout 222.23 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 
3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.120062e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.531776e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.524892e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 8.808915e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.248380e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 9.865480e-03 s Time to initialize coeftab 6.702952e-02 s Time to factorize 1.811893e+00 s ( 5.51 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 124 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 514 Ko / 514 Ko Time to solve 5.018538e-03 s Time for refinement 3.135924e-03 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.893975e-07 max(|| b_i - A x_i ||_1) 8.091629e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.016766e+00 (SUCCESS) Start 335: shm_example_simple_lap_s_facto2_sched0_not_rqrrtend Test #373: shm_example_simple_lap_d_facto0_sched0_kway_pqrcpilu1 ...................***Timeout 222.23 
sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.053103e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.647782e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.542828e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 3.890781e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.093843e+01 s 
+-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.306320e-03 s Time to initialize coeftab 3.362135e-01 s Time to factorize 1.482590e-01 s (34.14 MFlop/s) Number of operations 6.47 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 2.357930e-02 s - iteration 1 : total iteration time 0.0147 s error 2.3229e-15 Time for refinement 2.629122e-02 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.328177e-15 max(|| b_i - A x_i ||_1) 1.331069e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.672602e-03 (SUCCESS) Start 373: shm_example_simple_lap_d_facto0_sched0_kway_pqrcpilu1 Test #374: shm_example_simple_lap_d_facto1_sched0_not_svdbegin .....................***Timeout 222.23 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization 
method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.978494e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.314735e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.766156e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 9.098886e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.136536e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.927119e-01 s Time to initialize coeftab 7.811258e-01 s Time to factorize 9.017994e-01 s ( 5.80 MFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 1.158618e-02 s - iteration 1 : total iteration time 0.00668 s error 1.5585e-14 Time for refinement 1.699781e-02 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 
1.558197e-14 max(|| b_i - A x_i ||_1) 2.715471e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.412222e-02 (SUCCESS) Start 374: shm_example_simple_lap_d_facto1_sched0_not_svdbegin Test #381: shm_example_simple_lap_d_facto1_sched0_not_pqrcpend .....................***Timeout 222.23 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.078211e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.264485e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.232991e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 
Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.163953e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.107525e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 9.387831e-03 s Time to initialize coeftab 7.756706e-02 s Time to factorize 1.525549e-01 s (34.31 MFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 1.168150e-02 s Time for refinement 6.035619e-03 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.583590e-16 max(|| b_i - A x_i ||_1) 1.860623e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.338032e-03 (SUCCESS) Start 381: shm_example_simple_lap_d_facto1_sched0_not_pqrcpend Test #395: shm_example_simple_lap_d_facto1_sched0_kway_tqrcpend ....................***Timeout 222.22 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low 
rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.766099e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.504241e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.553654e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.813906e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.081924e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.166546e-01 s Time to initialize coeftab 1.009528e-01 s Time to factorize 1.706032e+00 s ( 3.07 MFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 1.479983e-02 s Time for refinement 
7.624511e-03 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.583590e-16 max(|| b_i - A x_i ||_1) 1.860623e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.338032e-03 (SUCCESS) Start 395: shm_example_simple_lap_d_facto1_sched0_kway_tqrcpend Test #406: shm_example_simple_lap_d_facto2_sched0_not_svdbegin .....................***Timeout 222.22 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Start 406: shm_example_simple_lap_d_facto2_sched0_not_svdbegin Test #415: shm_example_simple_lap_d_facto2_sched0_kway_pqrcpend ....................***Timeout 222.22 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: 
Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.960114e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.178330e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.003549e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 6.901616e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.072897e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.348981e-01 s Time to initialize coeftab 7.346580e-02 s Time to factorize 1.484349e+00 s ( 6.73 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: 
------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 4.984737e-03 s Time for refinement 3.071862e-03 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.643681e-16 max(|| b_i - A x_i ||_1) 1.795106e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.255705e-03 (SUCCESS) Start 415: shm_example_simple_lap_d_facto2_sched0_kway_pqrcpend Test #418: shm_example_simple_lap_d_facto2_sched0_not_rqrcpbegin ...................***Timeout 222.22 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.418673e+00 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.355907e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.231168e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 3.945688e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.874099e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.328932e-03 s Time to initialize coeftab 8.293566e-01 s Time to factorize 7.097672e+00 s ( 1.41 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 9.829548e-03 s - iteration 1 : total iteration time 0.0103 s error 5.5242e-14 Time for refinement 1.870585e-02 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.524210e-14 max(|| b_i - A x_i ||_1) 6.118383e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.688275e-02 (SUCCESS) Start 418: shm_example_simple_lap_d_facto2_sched0_not_rqrcpbegin Test #430: shm_example_simple_lap_d_facto2_sched0_not_rqrrtbegin ...................***Timeout 222.22 sec ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Start 430: shm_example_simple_lap_d_facto2_sched0_not_rqrrtbegin Test #434: shm_example_simple_lap_d_facto2_sched0_kwayprojections_rqrrtbegin .......***Timeout 222.22 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block 
Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.468686e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.947414e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.428344e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 5.072141e-02 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.541042e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.322781e-03 s Time to initialize coeftab 9.029260e-01 s Time to factorize 7.633441e-01 s (13.08 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 5.241697e-03 s - iteration 1 : total iteration time 0.00387 s error 1.008e-12 - iteration 2 : total iteration time 0.00355 s error 4.2236e-18 Time for refinement 1.160700e-02 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 
2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.255520e-16 max(|| b_i - A x_i ||_1) 6.250679e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.854516e-04 (SUCCESS) Start 434: shm_example_simple_lap_d_facto2_sched0_kwayprojections_rqrrtbegin Test #438: shm_example_simple_lap_c_facto0_sched0_not_svdbegin .....................***Timeout 222.22 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 438: shm_example_simple_lap_c_facto0_sched0_not_svdbegin Test #439: shm_example_simple_lap_c_facto0_sched0_not_svdend .......................***Timeout 222.22 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI 
processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 439: shm_example_simple_lap_c_facto0_sched0_not_svdend Test #447: shm_example_simple_lap_c_facto0_sched0_kway_pqrcpend ....................***Timeout 222.25 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 
+-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.537584e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.614861e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.240721e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 3.436128e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.638589e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 7.809575e-01 s Time to initialize coeftab 3.949611e-01 s Time to factorize 2.820073e-01 s (71.92 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 2.988748e-02 s Time for refinement 4.340911e-03 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.087415e-07 max(|| b_i - A x_i ||_1) 9.200869e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.321654e+00 (SUCCESS) Start 447: shm_example_simple_lap_c_facto0_sched0_kway_pqrcpend Test #451: shm_example_simple_lap_c_facto0_sched0_not_rqrcpend .....................***Timeout 222.25 
sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.646799e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.743311e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.213908e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 5.328877e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.215610e+00 s 
+-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.493558e-03 s Time to initialize coeftab 1.976418e-01 s Time to factorize 8.605949e-01 s (23.57 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 3.031144e-02 s Time for refinement 1.027756e-02 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.087415e-07 max(|| b_i - A x_i ||_1) 9.200869e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.321654e+00 (SUCCESS) Start 451: shm_example_simple_lap_c_facto0_sched0_not_rqrcpend Test #452: shm_example_simple_lap_c_facto0_sched0_kway_rqrcpbegin ..................***Timeout 222.26 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels 
of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 452: shm_example_simple_lap_c_facto0_sched0_kway_rqrcpbegin Test #459: shm_example_simple_lap_c_facto0_sched0_kway_tqrcpend ....................***Timeout 222.26 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 459: shm_example_simple_lap_c_facto0_sched0_kway_tqrcpend Test #465: shm_example_simple_lap_c_facto0_sched0_kway_rqrrtend ....................***Timeout 222.25 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI 
communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.445956e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.394584e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.860520e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 6.975899e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.178560e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.195533e-01 s Time to initialize coeftab 5.234444e-02 s Time to factorize 1.124369e+00 s (18.04 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 
248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 2.788971e-02 s Time for refinement 7.427993e-03 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.087415e-07 max(|| b_i - A x_i ||_1) 9.200869e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.321654e+00 (SUCCESS) Start 465: shm_example_simple_lap_c_facto0_sched0_kway_rqrrtend Test #466: shm_example_simple_lap_c_facto0_sched0_kwayprojections_rqrrtbegin .......***Timeout 222.25 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 466: shm_example_simple_lap_c_facto0_sched0_kwayprojections_rqrrtbegin Test #470: shm_example_simple_lap_c_facto1_sched0_not_svdbegin .....................***Timeout 222.25 sec ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 470: shm_example_simple_lap_c_facto1_sched0_not_svdbegin Test #482: shm_example_simple_lap_c_facto1_sched0_not_rqrcpbegin ...................***Timeout 222.26 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block 
Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 482: shm_example_simple_lap_c_facto1_sched0_not_rqrcpbegin Test #489: shm_example_simple_lap_c_facto1_sched0_not_tqrcpend .....................***Timeout 222.26 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 489: shm_example_simple_lap_c_facto1_sched0_not_tqrcpend Test #498: shm_example_simple_lap_c_facto1_sched0_kwayprojections_rqrrtbegin .......***Timeout 222.26 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: 
Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 498: shm_example_simple_lap_c_facto1_sched0_kwayprojections_rqrrtbegin Test #500: shm_example_simple_lap_c_facto1_sched0_kway_pqrcpilu0 ...................***Timeout 222.26 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 
nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.775349e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.895649e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.601592e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.219503e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.961933e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.411115e-03 s Time to initialize coeftab 2.485992e-02 s Time to factorize 3.348843e-01 s (63.63 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 1.560221e-02 s Time for refinement 4.097672e-03 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.112674e-07 max(|| b_i - A x_i ||_1) 1.168162e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.947623e+00 (SUCCESS) Start 500: shm_example_simple_lap_c_facto1_sched0_kway_pqrcpilu0 Test #503: shm_example_simple_lap_c_facto2_sched0_not_svdend 
.......................***Timeout 222.26 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 503: shm_example_simple_lap_c_facto2_sched0_not_svdend Test #504: shm_example_simple_lap_c_facto2_sched0_kway_svdbegin ....................***Timeout 222.26 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress 
minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.711319e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.566998e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.583630e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 1.870596e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.923947e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.148277e-02 s Time to initialize coeftab 1.085168e+00 s Time to factorize 4.954734e+00 s ( 8.07 MFlop/s) Number of operations 8.15 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 1.518801e-02 s Time for refinement 4.668155e-03 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 
6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.157403e-07 max(|| b_i - A x_i ||_1) 9.287170e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.343431e+00 (SUCCESS) Start 504: shm_example_simple_lap_c_facto2_sched0_kway_svdbegin Test #505: shm_example_simple_lap_c_facto2_sched0_kway_svdend ......................***Timeout 222.26 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.353080e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.390390e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.655364e-02 s +-------------------------------------------------+ 
Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 2.322874e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.777924e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.274156e-02 s Time to initialize coeftab 4.496627e-02 s Time to factorize 2.253949e+00 s (17.73 MFlop/s) Number of operations 8.15 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 1.380920e-02 s Time for refinement 4.390853e-03 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.031245e-07 max(|| b_i - A x_i ||_1) 8.570109e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.162495e+00 (SUCCESS) Start 505: shm_example_simple_lap_c_facto2_sched0_kway_svdend Test #509: shm_example_simple_lap_c_facto2_sched0_not_pqrcpend .....................***Timeout 222.27 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD 
Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.495603e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.666561e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.623960e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 3.476666e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.878343e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.646642e-03 s Time to initialize coeftab 1.627132e-01 s Time to factorize 5.535850e-01 s (72.20 MFlop/s) Number of operations 8.15 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 
1 Mo / 1 Mo Time to solve 2.771302e-02 s Time for refinement 5.673744e-03 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.031245e-07 max(|| b_i - A x_i ||_1) 8.570109e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.162495e+00 (SUCCESS) Start 509: shm_example_simple_lap_c_facto2_sched0_not_pqrcpend Test #510: shm_example_simple_lap_c_facto2_sched0_kway_pqrcpbegin ..................***Timeout 222.27 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.949456e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.085722e-01 s +-------------------------------------------------+ 
Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.696766e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 7.456855e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.108541e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.558890e-03 s Time to initialize coeftab 4.202091e-01 s Time to factorize 1.772939e+00 s (22.54 MFlop/s) Number of operations 8.15 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 1.476389e-02 s - iteration 1 : total iteration time 0.017 s error 2.0432e-11 Time for refinement 2.342945e-02 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822262e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.307790e-08 max(|| b_i - A x_i ||_1) 3.124648e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.884421e-01 (SUCCESS) Start 510: shm_example_simple_lap_c_facto2_sched0_kway_pqrcpbegin 528/3626 Test #804: shm_example_simple_lap_s_facto1_sched1_kway_rqrcpbegin ..................***Timeout 222.16 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI 
processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 804: shm_example_simple_lap_s_facto1_sched1_kway_rqrcpbegin 528/3626 Test #806: shm_example_simple_lap_s_facto1_sched1_kwayprojections_rqrcpbegin .......***Timeout 222.18 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 806: 
shm_example_simple_lap_s_facto1_sched1_kwayprojections_rqrcpbegin 528/3626 Test #807: shm_example_simple_lap_s_facto1_sched1_kwayprojections_rqrcpend .........***Timeout 222.18 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 807: shm_example_simple_lap_s_facto1_sched1_kwayprojections_rqrcpend 528/3626 Test #808: shm_example_simple_lap_s_facto1_sched1_not_tqrcpbegin ...................***Timeout 222.15 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 
Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 808: shm_example_simple_lap_s_facto1_sched1_not_tqrcpbegin 528/3626 Test #809: shm_example_simple_lap_s_facto1_sched1_not_tqrcpend .....................***Timeout 222.15 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 809: shm_example_simple_lap_s_facto1_sched1_not_tqrcpend 528/3626 Test #810: shm_example_simple_lap_s_facto1_sched1_kway_tqrcpbegin ..................***Timeout 222.14 sec ischedInit: The thread number has been 
automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 810: shm_example_simple_lap_s_facto1_sched1_kway_tqrcpbegin 528/3626 Test #811: shm_example_simple_lap_s_facto1_sched1_kway_tqrcpend ....................***Timeout 222.13 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 
Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 811: shm_example_simple_lap_s_facto1_sched1_kway_tqrcpend 528/3626 Test #812: shm_example_simple_lap_s_facto1_sched1_kwayprojections_tqrcpbegin .......***Timeout 222.12 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 812: shm_example_simple_lap_s_facto1_sched1_kwayprojections_tqrcpbegin 528/3626 Test #813: shm_example_simple_lap_s_facto1_sched1_kwayprojections_tqrcpend .........***Timeout 222.12 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread 
static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 813: shm_example_simple_lap_s_facto1_sched1_kwayprojections_tqrcpend 528/3626 Test #814: shm_example_simple_lap_s_facto1_sched1_not_rqrrtbegin ...................***Timeout 222.11 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections 
width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 814: shm_example_simple_lap_s_facto1_sched1_not_rqrrtbegin 528/3626 Test #815: shm_example_simple_lap_s_facto1_sched1_not_rqrrtend .....................***Timeout 222.11 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 815: shm_example_simple_lap_s_facto1_sched1_not_rqrrtend 528/3626 Test #817: shm_example_simple_lap_s_facto1_sched1_kway_rqrrtend ....................***Timeout 222.11 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 
2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 817: shm_example_simple_lap_s_facto1_sched1_kway_rqrrtend 528/3626 Test #818: shm_example_simple_lap_s_facto1_sched1_kwayprojections_rqrrtbegin .......***Timeout 222.10 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 818: shm_example_simple_lap_s_facto1_sched1_kwayprojections_rqrrtbegin 528/3626 Test #819: shm_example_simple_lap_s_facto1_sched1_kwayprojections_rqrrtend 
.........***Timeout 222.08 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 819: shm_example_simple_lap_s_facto1_sched1_kwayprojections_rqrrtend 528/3626 Test #820: shm_example_simple_lap_s_facto1_sched1_kway_pqrcpilu0 ...................***Timeout 222.07 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress 
method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 820: shm_example_simple_lap_s_facto1_sched1_kway_pqrcpilu0 528/3626 Test #821: shm_example_simple_lap_s_facto1_sched1_kway_pqrcpilu1 ...................***Timeout 222.07 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 821: shm_example_simple_lap_s_facto1_sched1_kway_pqrcpilu1 528/3626 Test #822: shm_example_simple_lap_s_facto2_sched1_not_svdbegin .....................***Timeout 222.06 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 822: shm_example_simple_lap_s_facto2_sched1_not_svdbegin 528/3626 Test #823: shm_example_simple_lap_s_facto2_sched1_not_svdend .......................***Timeout 222.05 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels 
of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 823: shm_example_simple_lap_s_facto2_sched1_not_svdend 528/3626 Test #824: shm_example_simple_lap_s_facto2_sched1_kway_svdbegin ....................***Timeout 222.05 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 824: shm_example_simple_lap_s_facto2_sched1_kway_svdbegin 528/3626 Test #825: shm_example_simple_lap_s_facto2_sched1_kway_svdend ......................***Timeout 222.05 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI 
communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 825: shm_example_simple_lap_s_facto2_sched1_kway_svdend 528/3626 Test #826: shm_example_simple_lap_s_facto2_sched1_kwayprojections_svdbegin .........***Timeout 222.04 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 826: shm_example_simple_lap_s_facto2_sched1_kwayprojections_svdbegin 528/3626 Test #827: 
shm_example_simple_lap_s_facto2_sched1_kwayprojections_svdend ...........***Timeout 222.04 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 827: shm_example_simple_lap_s_facto2_sched1_kwayprojections_svdend 528/3626 Test #828: shm_example_simple_lap_s_facto2_sched1_not_pqrcpbegin ...................***Timeout 222.03 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank 
parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 828: shm_example_simple_lap_s_facto2_sched1_not_pqrcpbegin 528/3626 Test #829: shm_example_simple_lap_s_facto2_sched1_not_pqrcpend .....................***Timeout 222.03 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 829: shm_example_simple_lap_s_facto2_sched1_not_pqrcpend 528/3626 Test #830: shm_example_simple_lap_s_facto2_sched1_kway_pqrcpbegin ..................***Timeout 222.02 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel 
Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 830: shm_example_simple_lap_s_facto2_sched1_kway_pqrcpbegin 528/3626 Test #831: shm_example_simple_lap_s_facto2_sched1_kway_pqrcpend ....................***Timeout 222.01 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels 
of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 831: shm_example_simple_lap_s_facto2_sched1_kway_pqrcpend Start 1039: shm_example_simple_lap_c_facto2_sched1_kway_rqrrtbegin Test #89: c_shm_example_personal_lap_s_facto0 .....................................***Timeout 223.96 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 89: c_shm_example_personal_lap_s_facto0 Test #90: c_shm_example_personal_lap_s_facto1 .....................................***Timeout 223.95 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric 
Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 90: c_shm_example_personal_lap_s_facto1 Test #91: c_shm_example_personal_lap_s_facto2 .....................................***Timeout 223.95 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 91: c_shm_example_personal_lap_s_facto2 Test #92: c_shm_example_personal_lap_d_facto0 .....................................***Timeout 223.95 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Start 92: c_shm_example_personal_lap_d_facto0 Test #93: c_shm_example_personal_lap_d_facto1 .....................................***Timeout 223.95 sec ischedInit: The 
thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Start 93: c_shm_example_personal_lap_d_facto1 Test #94: c_shm_example_personal_lap_d_facto2 .....................................***Timeout 223.96 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Start 94: c_shm_example_personal_lap_d_facto2 Test #95: c_shm_example_personal_lap_c_facto0 .....................................***Timeout 223.96 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 
Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 95: c_shm_example_personal_lap_c_facto0 Test #96: c_shm_example_personal_lap_c_facto1 .....................................***Timeout 223.96 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 96: c_shm_example_personal_lap_c_facto1 Test #97: c_shm_example_personal_lap_c_facto2 .....................................***Timeout 223.96 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI 
communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 97: c_shm_example_personal_lap_c_facto2 Test #98: c_shm_example_personal_lap_c_facto3 .....................................***Timeout 223.95 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 98: c_shm_example_personal_lap_c_facto3 Test #99: c_shm_example_personal_lap_c_facto4 .....................................***Timeout 223.95 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low 
rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 99: c_shm_example_personal_lap_c_facto4 Test #100: c_shm_example_personal_lap_z_facto0 .....................................***Timeout 223.95 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 100: c_shm_example_personal_lap_z_facto0 Test #101: c_shm_example_personal_lap_z_facto1 .....................................***Timeout 223.95 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 101: c_shm_example_personal_lap_z_facto1 Test #102: 
c_shm_example_personal_lap_z_facto2 .....................................***Timeout 223.95 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 102: c_shm_example_personal_lap_z_facto2 Test #103: c_shm_example_personal_lap_z_facto3 .....................................***Timeout 223.95 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 103: c_shm_example_personal_lap_z_facto3 Test #104: c_shm_example_personal_lap_z_facto4 .....................................***Timeout 223.93 sec ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 104: c_shm_example_personal_lap_z_facto4 Test #121: c_shm_example_simple_scotch_rsa .........................................***Timeout 223.93 sec RSA driver is no longer supported and is replaced by the HB driver ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 12111 nnz: 40537 Start 121: c_shm_example_simple_scotch_rsa Test #125: c_shm_example_simple_single_rsa .........................................***Timeout 223.93 sec RSA driver is no longer supported and is replaced by the HB driver ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel 
Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 12111 nnz: 40537 Start 125: c_shm_example_simple_single_rsa Test #129: c_shm_example_step-by-step_single_rsa ...................................***Timeout 223.93 sec RSA driver is no longer supported and is replaced by the HB driver ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 12111 nnz: 40537 Start 129: c_shm_example_step-by-step_single_rsa Test #133: c_shm_example_simple_refine_cg ..........................................***Timeout 223.89 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: 
Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 12111 nnz: 40537 Start 133: c_shm_example_simple_refine_cg Test #149: c_shm_example_refinement_lap_z_refine_gmres_her .........................***Timeout 223.72 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Complex64 Format: CSC N: 1000 nnz: 11476 Start 149: c_shm_example_refinement_lap_z_refine_gmres_her Test #152: c_shm_example_refinement_lap_z_refine_gmres_sym .........................***Timeout 223.70 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple 
Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 152: c_shm_example_refinement_lap_z_refine_gmres_sym Test #154: c_shm_example_simple_mixed_refine_cg ....................................***Timeout 223.68 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 12111 nnz: 40537 Start 154: c_shm_example_simple_mixed_refine_cg Test #410: shm_example_simple_lap_d_facto2_sched0_kwayprojections_svdbegin .........***Timeout 223.25 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory 
Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Start 410: shm_example_simple_lap_d_facto2_sched0_kwayprojections_svdbegin Test #440: shm_example_simple_lap_c_facto0_sched0_kway_svdbegin ....................***Timeout 223.18 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 440: shm_example_simple_lap_c_facto0_sched0_kway_svdbegin Test #442: shm_example_simple_lap_c_facto0_sched0_kwayprojections_svdbegin .........***Timeout 223.18 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 442: shm_example_simple_lap_c_facto0_sched0_kwayprojections_svdbegin Test #443: shm_example_simple_lap_c_facto0_sched0_kwayprojections_svdend ...........***Timeout 223.18 sec Start 443: shm_example_simple_lap_c_facto0_sched0_kwayprojections_svdend Test #472: shm_example_simple_lap_c_facto1_sched0_kway_svdbegin ....................***Timeout 223.10 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress 
minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 472: shm_example_simple_lap_c_facto1_sched0_kway_svdbegin Test #506: shm_example_simple_lap_c_facto2_sched0_kwayprojections_svdbegin .........***Timeout 223.01 sec Start 506: shm_example_simple_lap_c_facto2_sched0_kwayprojections_svdbegin 528/3626 Test #805: shm_example_simple_lap_s_facto1_sched1_kway_rqrcpend ....................***Timeout 222.86 sec Start 805: shm_example_simple_lap_s_facto1_sched1_kway_rqrcpend 528/3626 Test #816: shm_example_simple_lap_s_facto1_sched1_kway_rqrrtbegin ..................***Timeout 222.68 sec Start 816: shm_example_simple_lap_s_facto1_sched1_kway_rqrrtbegin 528/3626 Test #832: shm_example_simple_lap_s_facto2_sched1_kwayprojections_pqrcpbegin ....... Passed 191.81 sec 529/3626 Test #837: shm_example_simple_lap_s_facto2_sched1_kway_rqrcpend .................... Passed 161.29 sec 530/3626 Test #834: shm_example_simple_lap_s_facto2_sched1_not_rqrcpbegin ................... Passed 163.40 sec 531/3626 Test #838: shm_example_simple_lap_s_facto2_sched1_kwayprojections_rqrcpbegin ....... Passed 160.37 sec 532/3626 Test #846: shm_example_simple_lap_s_facto2_sched1_not_rqrrtbegin ................... Passed 141.17 sec Start 1040: shm_example_simple_lap_c_facto2_sched1_kway_rqrrtend Start 1041: shm_example_simple_lap_c_facto2_sched1_kwayprojections_rqrrtbegin Start 1042: shm_example_simple_lap_c_facto2_sched1_kwayprojections_rqrrtend Start 1043: shm_example_simple_lap_c_facto2_sched1_kway_pqrcpilu0 Start 1044: shm_example_simple_lap_c_facto2_sched1_kway_pqrcpilu1 533/3626 Test #847: shm_example_simple_lap_s_facto2_sched1_not_rqrrtend ..................... 
Passed 139.97 sec Start 1045: shm_example_simple_lap_c_facto3_sched1_not_svdbegin 534/3626 Test #835: shm_example_simple_lap_s_facto2_sched1_not_rqrcpend ..................... Passed 166.48 sec Start 1046: shm_example_simple_lap_c_facto3_sched1_not_svdend 535/3626 Test #848: shm_example_simple_lap_s_facto2_sched1_kway_rqrrtbegin .................. Passed 143.13 sec Start 1047: shm_example_simple_lap_c_facto3_sched1_kway_svdbegin 536/3626 Test #857: shm_example_simple_lap_d_facto0_sched1_kway_svdend ...................... Passed 140.81 sec Start 1048: shm_example_simple_lap_c_facto3_sched1_kway_svdend 537/3626 Test #849: shm_example_simple_lap_s_facto2_sched1_kway_rqrrtend .................... Passed 143.07 sec Start 1049: shm_example_simple_lap_c_facto3_sched1_kwayprojections_svdbegin 538/3626 Test #855: shm_example_simple_lap_d_facto0_sched1_not_svdend ....................... Passed 144.24 sec Start 1050: shm_example_simple_lap_c_facto3_sched1_kwayprojections_svdend 539/3626 Test #843: shm_example_simple_lap_s_facto2_sched1_kway_tqrcpend .................... Passed 154.92 sec Start 1051: shm_example_simple_lap_c_facto3_sched1_not_pqrcpbegin 540/3626 Test #844: shm_example_simple_lap_s_facto2_sched1_kwayprojections_tqrcpbegin ....... Passed 155.58 sec Start 1052: shm_example_simple_lap_c_facto3_sched1_not_pqrcpend 541/3626 Test #851: shm_example_simple_lap_s_facto2_sched1_kwayprojections_rqrrtend ......... Passed 151.40 sec Start 1053: shm_example_simple_lap_c_facto3_sched1_kway_pqrcpbegin 542/3626 Test #859: shm_example_simple_lap_d_facto0_sched1_kwayprojections_svdend ........... Passed 148.89 sec Start 1054: shm_example_simple_lap_c_facto3_sched1_kway_pqrcpend 543/3626 Test #860: shm_example_simple_lap_d_facto0_sched1_not_pqrcpbegin ................... Passed 150.13 sec Start 1055: shm_example_simple_lap_c_facto3_sched1_kwayprojections_pqrcpbegin 544/3626 Test #866: shm_example_simple_lap_d_facto0_sched1_not_rqrcpbegin ................... 
Passed 144.20 sec Start 1056: shm_example_simple_lap_c_facto3_sched1_kwayprojections_pqrcpend 545/3626 Test #871: shm_example_simple_lap_d_facto0_sched1_kwayprojections_rqrcpend ......... Passed 145.50 sec Start 1057: shm_example_simple_lap_c_facto3_sched1_not_rqrcpbegin 546/3626 Test #869: shm_example_simple_lap_d_facto0_sched1_kway_rqrcpend .................... Passed 146.21 sec Start 1058: shm_example_simple_lap_c_facto3_sched1_not_rqrcpend 547/3626 Test #853: shm_example_simple_lap_s_facto2_sched1_kway_pqrcpilu1 ................... Passed 159.28 sec Start 1059: shm_example_simple_lap_c_facto3_sched1_kway_rqrcpbegin 548/3626 Test #868: shm_example_simple_lap_d_facto0_sched1_kway_rqrcpbegin .................. Passed 150.31 sec Start 1060: shm_example_simple_lap_c_facto3_sched1_kway_rqrcpend 549/3626 Test #895: shm_example_simple_lap_d_facto1_sched1_kway_pqrcpend .................... Passed 143.53 sec Start 1061: shm_example_simple_lap_c_facto3_sched1_kwayprojections_rqrcpbegin 550/3626 Test #872: shm_example_simple_lap_d_facto0_sched1_not_tqrcpbegin ................... Passed 152.47 sec Start 1062: shm_example_simple_lap_c_facto3_sched1_kwayprojections_rqrcpend 551/3626 Test #882: shm_example_simple_lap_d_facto0_sched1_kwayprojections_rqrrtbegin ....... Passed 150.50 sec Start 1063: shm_example_simple_lap_c_facto3_sched1_not_tqrcpbegin 552/3626 Test #908: shm_example_simple_lap_d_facto1_sched1_kwayprojections_tqrcpbegin ....... Passed 145.44 sec Start 1064: shm_example_simple_lap_c_facto3_sched1_not_tqrcpend 553/3626 Test #906: shm_example_simple_lap_d_facto1_sched1_kway_tqrcpbegin .................. Passed 145.74 sec Start 1065: shm_example_simple_lap_c_facto3_sched1_kway_tqrcpbegin 554/3626 Test #862: shm_example_simple_lap_d_facto0_sched1_kway_pqrcpbegin .................. Passed 163.44 sec Start 1066: shm_example_simple_lap_c_facto3_sched1_kway_tqrcpend 555/3626 Test #881: shm_example_simple_lap_d_facto0_sched1_kway_rqrrtend .................... 
Passed 154.18 sec Start 1067: shm_example_simple_lap_c_facto3_sched1_kwayprojections_tqrcpbegin 556/3626 Test #865: shm_example_simple_lap_d_facto0_sched1_kwayprojections_pqrcpend ......... Passed 161.60 sec Start 1068: shm_example_simple_lap_c_facto3_sched1_kwayprojections_tqrcpend 557/3626 Test #910: shm_example_simple_lap_d_facto1_sched1_not_rqrrtbegin ................... Passed 146.33 sec Start 1069: shm_example_simple_lap_c_facto3_sched1_not_rqrrtbegin 558/3626 Test #912: shm_example_simple_lap_d_facto1_sched1_kway_rqrrtbegin .................. Passed 146.75 sec Start 1070: shm_example_simple_lap_c_facto3_sched1_not_rqrrtend 559/3626 Test #833: shm_example_simple_lap_s_facto2_sched1_kwayprojections_pqrcpend .........***Timeout 200.24 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 833: shm_example_simple_lap_s_facto2_sched1_kwayprojections_pqrcpend 559/3626 Test #921: shm_example_simple_lap_d_facto2_sched1_kway_svdend 
...................... Passed 141.94 sec Start 1071: shm_example_simple_lap_c_facto3_sched1_kway_rqrrtbegin 560/3626 Test #915: shm_example_simple_lap_d_facto1_sched1_kwayprojections_rqrrtend ......... Passed 144.57 sec Start 1072: shm_example_simple_lap_c_facto3_sched1_kway_rqrrtend 561/3626 Test #841: shm_example_simple_lap_s_facto2_sched1_not_tqrcpend ..................... Passed 191.31 sec Start 1073: shm_example_simple_lap_c_facto3_sched1_kwayprojections_rqrrtbegin 562/3626 Test #917: shm_example_simple_lap_d_facto1_sched1_kway_pqrcpilu1 ................... Passed 143.46 sec Start 1074: shm_example_simple_lap_c_facto3_sched1_kwayprojections_rqrrtend 563/3626 Test #836: shm_example_simple_lap_s_facto2_sched1_kway_rqrcpbegin ..................***Timeout 200.04 sec Start 836: shm_example_simple_lap_s_facto2_sched1_kway_rqrcpbegin 563/3626 Test #916: shm_example_simple_lap_d_facto1_sched1_kway_pqrcpilu0 ................... Passed 145.79 sec Start 1075: shm_example_simple_lap_c_facto3_sched1_kway_pqrcpilu0 564/3626 Test #919: shm_example_simple_lap_d_facto2_sched1_not_svdend ....................... Passed 144.04 sec Start 1076: shm_example_simple_lap_c_facto3_sched1_kway_pqrcpilu1 565/3626 Test #920: shm_example_simple_lap_d_facto2_sched1_kway_svdbegin .................... Passed 144.53 sec Start 1077: shm_example_simple_lap_c_facto4_sched1_not_svdbegin 566/3626 Test #839: shm_example_simple_lap_s_facto2_sched1_kwayprojections_rqrcpend .........***Timeout 200.05 sec Start 839: shm_example_simple_lap_s_facto2_sched1_kwayprojections_rqrcpend 566/3626 Test #858: shm_example_simple_lap_d_facto0_sched1_kwayprojections_svdbegin ......... Passed 174.08 sec Start 1078: shm_example_simple_lap_c_facto4_sched1_not_svdend 567/3626 Test #870: shm_example_simple_lap_d_facto0_sched1_kwayprojections_rqrcpbegin ....... 
Passed 163.07 sec Start 1079: shm_example_simple_lap_c_facto4_sched1_kway_svdbegin 568/3626 Test #918: shm_example_simple_lap_d_facto2_sched1_not_svdbegin ..................... Passed 145.51 sec Start 1080: shm_example_simple_lap_c_facto4_sched1_kway_svdend 569/3626 Test #840: shm_example_simple_lap_s_facto2_sched1_not_tqrcpbegin ...................***Timeout 200.06 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 840: shm_example_simple_lap_s_facto2_sched1_not_tqrcpbegin 569/3626 Test #854: shm_example_simple_lap_d_facto0_sched1_not_svdbegin ..................... Passed 181.45 sec Start 1081: shm_example_simple_lap_c_facto4_sched1_kwayprojections_svdbegin 570/3626 Test #922: shm_example_simple_lap_d_facto2_sched1_kwayprojections_svdbegin ......... Passed 151.25 sec Start 1082: shm_example_simple_lap_c_facto4_sched1_kwayprojections_svdend 571/3626 Test #845: shm_example_simple_lap_s_facto2_sched1_kwayprojections_tqrcpend ......... 
Passed 187.97 sec Start 1083: shm_example_simple_lap_c_facto4_sched1_not_pqrcpbegin 572/3626 Test #924: shm_example_simple_lap_d_facto2_sched1_not_pqrcpbegin ................... Passed 151.37 sec Start 1084: shm_example_simple_lap_c_facto4_sched1_not_pqrcpend 573/3626 Test #842: shm_example_simple_lap_s_facto2_sched1_kway_tqrcpbegin ..................***Timeout 200.05 sec Start 842: shm_example_simple_lap_s_facto2_sched1_kway_tqrcpbegin 573/3626 Test #905: shm_example_simple_lap_d_facto1_sched1_not_tqrcpend ..................... Passed 161.80 sec Start 1085: shm_example_simple_lap_c_facto4_sched1_kway_pqrcpbegin 574/3626 Test #852: shm_example_simple_lap_s_facto2_sched1_kway_pqrcpilu0 ................... Passed 187.25 sec Start 1086: shm_example_simple_lap_c_facto4_sched1_kway_pqrcpend 575/3626 Test #936: shm_example_simple_lap_d_facto2_sched1_not_tqrcpend ..................... Passed 145.55 sec Start 1087: shm_example_simple_lap_c_facto4_sched1_kwayprojections_pqrcpbegin 576/3626 Test #907: shm_example_simple_lap_d_facto1_sched1_kway_tqrcpend .................... Passed 166.38 sec Start 1088: shm_example_simple_lap_c_facto4_sched1_kwayprojections_pqrcpend 577/3626 Test #874: shm_example_simple_lap_d_facto0_sched1_kway_tqrcpbegin .................. Passed 175.51 sec Start 1089: shm_example_simple_lap_c_facto4_sched1_not_rqrcpbegin 578/3626 Test #856: shm_example_simple_lap_d_facto0_sched1_kway_svdbegin .................... Passed 189.45 sec Start 1090: shm_example_simple_lap_c_facto4_sched1_not_rqrcpend 579/3626 Test #880: shm_example_simple_lap_d_facto0_sched1_kway_rqrrtbegin .................. Passed 174.04 sec Start 1091: shm_example_simple_lap_c_facto4_sched1_kway_rqrcpbegin 580/3626 Test #861: shm_example_simple_lap_d_facto0_sched1_not_pqrcpend ..................... Passed 183.99 sec Start 1092: shm_example_simple_lap_c_facto4_sched1_kway_rqrcpend 581/3626 Test #925: shm_example_simple_lap_d_facto2_sched1_not_pqrcpend ..................... 
Passed 158.44 sec Start 1093: shm_example_simple_lap_c_facto4_sched1_kwayprojections_rqrcpbegin 582/3626 Test #904: shm_example_simple_lap_d_facto1_sched1_not_tqrcpbegin ................... Passed 168.07 sec Start 1094: shm_example_simple_lap_c_facto4_sched1_kwayprojections_rqrcpend 583/3626 Test #929: shm_example_simple_lap_d_facto2_sched1_kwayprojections_pqrcpend ......... Passed 157.96 sec Start 1095: shm_example_simple_lap_c_facto4_sched1_not_tqrcpbegin 584/3626 Test #873: shm_example_simple_lap_d_facto0_sched1_not_tqrcpend ..................... Passed 179.63 sec Start 1096: shm_example_simple_lap_c_facto4_sched1_not_tqrcpend 585/3626 Test #890: shm_example_simple_lap_d_facto1_sched1_kwayprojections_svdbegin ......... Passed 174.96 sec Start 1097: shm_example_simple_lap_c_facto4_sched1_kway_tqrcpbegin 586/3626 Test #891: shm_example_simple_lap_d_facto1_sched1_kwayprojections_svdend ........... Passed 174.99 sec Start 1098: shm_example_simple_lap_c_facto4_sched1_kway_tqrcpend 587/3626 Test #902: shm_example_simple_lap_d_facto1_sched1_kwayprojections_rqrcpbegin ....... Passed 173.69 sec Start 1099: shm_example_simple_lap_c_facto4_sched1_kwayprojections_tqrcpbegin 588/3626 Test #883: shm_example_simple_lap_d_facto0_sched1_kwayprojections_rqrrtend ......... Passed 177.49 sec Start 1100: shm_example_simple_lap_c_facto4_sched1_kwayprojections_tqrcpend 589/3626 Test #875: shm_example_simple_lap_d_facto0_sched1_kway_tqrcpend .................... Passed 181.48 sec Start 1101: shm_example_simple_lap_c_facto4_sched1_not_rqrrtbegin 590/3626 Test #887: shm_example_simple_lap_d_facto1_sched1_not_svdend ....................... Passed 176.93 sec Start 1102: shm_example_simple_lap_c_facto4_sched1_not_rqrrtend 591/3626 Test #897: shm_example_simple_lap_d_facto1_sched1_kwayprojections_pqrcpend ......... Passed 176.60 sec Start 1103: shm_example_simple_lap_c_facto4_sched1_kway_rqrrtbegin 592/3626 Test #879: shm_example_simple_lap_d_facto0_sched1_not_rqrrtend ..................... 
Passed 180.83 sec Start 1104: shm_example_simple_lap_c_facto4_sched1_kway_rqrrtend 593/3626 Test #899: shm_example_simple_lap_d_facto1_sched1_not_rqrcpend ..................... Passed 176.41 sec Start 1105: shm_example_simple_lap_c_facto4_sched1_kwayprojections_rqrrtbegin 594/3626 Test #940: shm_example_simple_lap_d_facto2_sched1_kwayprojections_tqrcpend ......... Passed 150.55 sec Start 1106: shm_example_simple_lap_c_facto4_sched1_kwayprojections_rqrrtend 595/3626 Test #938: shm_example_simple_lap_d_facto2_sched1_kway_tqrcpend .................... Passed 152.49 sec Start 1107: shm_example_simple_lap_c_facto4_sched1_kway_pqrcpilu0 596/3626 Test #939: shm_example_simple_lap_d_facto2_sched1_kwayprojections_tqrcpbegin ....... Passed 152.13 sec Start 1108: shm_example_simple_lap_c_facto4_sched1_kway_pqrcpilu1 597/3626 Test #944: shm_example_simple_lap_d_facto2_sched1_kway_rqrrtend .................... Passed 143.24 sec Start 1109: shm_example_simple_lap_z_facto0_sched1_not_svdbegin 598/3626 Test #888: shm_example_simple_lap_d_facto1_sched1_kway_svdbegin .................... Passed 178.70 sec Start 1110: shm_example_simple_lap_z_facto0_sched1_not_svdend 599/3626 Test #923: shm_example_simple_lap_d_facto2_sched1_kwayprojections_svdend ........... Passed 166.67 sec Start 1111: shm_example_simple_lap_z_facto0_sched1_kway_svdbegin 600/3626 Test #934: shm_example_simple_lap_d_facto2_sched1_kwayprojections_rqrcpbegin ....... Passed 157.75 sec Start 1112: shm_example_simple_lap_z_facto0_sched1_kway_svdend 601/3626 Test #886: shm_example_simple_lap_d_facto1_sched1_not_svdbegin ..................... Passed 179.19 sec Start 1113: shm_example_simple_lap_z_facto0_sched1_kwayprojections_svdbegin 602/3626 Test #909: shm_example_simple_lap_d_facto1_sched1_kwayprojections_tqrcpend ......... Passed 174.64 sec Start 1114: shm_example_simple_lap_z_facto0_sched1_kwayprojections_svdend 603/3626 Test #942: shm_example_simple_lap_d_facto2_sched1_not_rqrrtend ..................... 
Passed 146.76 sec Start 1115: shm_example_simple_lap_z_facto0_sched1_not_pqrcpbegin 604/3626 Test #893: shm_example_simple_lap_d_facto1_sched1_not_pqrcpend ..................... Passed 179.49 sec Start 1116: shm_example_simple_lap_z_facto0_sched1_not_pqrcpend 605/3626 Test #850: shm_example_simple_lap_s_facto2_sched1_kwayprojections_rqrrtbegin .......***Timeout 200.14 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 850: shm_example_simple_lap_s_facto2_sched1_kwayprojections_rqrrtbegin 605/3626 Test #884: shm_example_simple_lap_d_facto0_sched1_kway_pqrcpilu0 ................... Passed 182.43 sec Start 1117: shm_example_simple_lap_z_facto0_sched1_kway_pqrcpbegin 606/3626 Test #941: shm_example_simple_lap_d_facto2_sched1_not_rqrrtbegin ................... Passed 151.79 sec Start 1118: shm_example_simple_lap_z_facto0_sched1_kway_pqrcpend 607/3626 Test #966: shm_example_simple_lap_c_facto0_sched1_kwayprojections_rqrcpend ......... 
Passed 131.54 sec Start 1119: shm_example_simple_lap_z_facto0_sched1_kwayprojections_pqrcpbegin 608/3626 Test #926: shm_example_simple_lap_d_facto2_sched1_kway_pqrcpbegin .................. Passed 170.33 sec Start 1120: shm_example_simple_lap_z_facto0_sched1_kwayprojections_pqrcpend 609/3626 Test #877: shm_example_simple_lap_d_facto0_sched1_kwayprojections_tqrcpend ......... Passed 189.10 sec Start 1121: shm_example_simple_lap_z_facto0_sched1_not_rqrcpbegin 610/3626 Test #864: shm_example_simple_lap_d_facto0_sched1_kwayprojections_pqrcpbegin ....... Passed 197.67 sec Start 1122: shm_example_simple_lap_z_facto0_sched1_not_rqrcpend 611/3626 Test #903: shm_example_simple_lap_d_facto1_sched1_kwayprojections_rqrcpend ......... Passed 184.91 sec Start 1123: shm_example_simple_lap_z_facto0_sched1_kway_rqrcpbegin 612/3626 Test #974: shm_example_simple_lap_c_facto0_sched1_not_rqrrtend ..................... Passed 129.97 sec Start 1124: shm_example_simple_lap_z_facto0_sched1_kway_rqrcpend 613/3626 Test #930: shm_example_simple_lap_d_facto2_sched1_not_rqrcpbegin ................... Passed 172.39 sec Start 1125: shm_example_simple_lap_z_facto0_sched1_kwayprojections_rqrcpbegin 614/3626 Test #863: shm_example_simple_lap_d_facto0_sched1_kway_pqrcpend .................... Passed 199.83 sec Start 1126: shm_example_simple_lap_z_facto0_sched1_kwayprojections_rqrcpend 615/3626 Test #892: shm_example_simple_lap_d_facto1_sched1_not_pqrcpbegin ................... Passed 188.18 sec Start 1127: shm_example_simple_lap_z_facto0_sched1_not_tqrcpbegin 616/3626 Test #946: shm_example_simple_lap_d_facto2_sched1_kwayprojections_rqrrtend ......... Passed 151.01 sec Start 1128: shm_example_simple_lap_z_facto0_sched1_not_tqrcpend 617/3626 Test #976: shm_example_simple_lap_c_facto0_sched1_kway_rqrrtend .................... Passed 131.17 sec Start 1129: shm_example_simple_lap_z_facto0_sched1_kway_tqrcpbegin 618/3626 Test #935: shm_example_simple_lap_d_facto2_sched1_kwayprojections_rqrcpend ......... 
Passed 167.45 sec Start 1130: shm_example_simple_lap_z_facto0_sched1_kway_tqrcpend 619/3626 Test #955: shm_example_simple_lap_c_facto0_sched1_not_pqrcpbegin ................... Passed 144.31 sec Start 1131: shm_example_simple_lap_z_facto0_sched1_kwayprojections_tqrcpbegin 620/3626 Test #967: shm_example_simple_lap_c_facto0_sched1_not_tqrcpbegin ................... Passed 138.03 sec Start 1132: shm_example_simple_lap_z_facto0_sched1_kwayprojections_tqrcpend 621/3626 Test #961: shm_example_simple_lap_c_facto0_sched1_not_rqrcpbegin ................... Passed 141.08 sec Start 1133: shm_example_simple_lap_z_facto0_sched1_not_rqrrtbegin 622/3626 Test #867: shm_example_simple_lap_d_facto0_sched1_not_rqrcpend .....................***Timeout 200.05 sec Start 867: shm_example_simple_lap_d_facto0_sched1_not_rqrcpend 622/3626 Test #900: shm_example_simple_lap_d_facto1_sched1_kway_rqrcpbegin .................. Passed 193.38 sec Start 1134: shm_example_simple_lap_z_facto0_sched1_not_rqrrtend 623/3626 Test #896: shm_example_simple_lap_d_facto1_sched1_kwayprojections_pqrcpbegin ....... Passed 194.30 sec Start 1135: shm_example_simple_lap_z_facto0_sched1_kway_rqrrtbegin 624/3626 Test #876: shm_example_simple_lap_d_facto0_sched1_kwayprojections_tqrcpbegin .......***Timeout 200.04 sec Start 876: shm_example_simple_lap_d_facto0_sched1_kwayprojections_tqrcpbegin 624/3626 Test #948: shm_example_simple_lap_d_facto2_sched1_kway_pqrcpilu1 ................... Passed 156.99 sec Start 1136: shm_example_simple_lap_z_facto0_sched1_kway_rqrrtend 625/3626 Test #878: shm_example_simple_lap_d_facto0_sched1_not_rqrrtbegin ...................***Timeout 200.04 sec Start 878: shm_example_simple_lap_d_facto0_sched1_not_rqrrtbegin 625/3626 Test #971: shm_example_simple_lap_c_facto0_sched1_kwayprojections_tqrcpbegin ....... 
Passed 140.16 sec Start 1137: shm_example_simple_lap_z_facto0_sched1_kwayprojections_rqrrtbegin 626/3626 Test #954: shm_example_simple_lap_c_facto0_sched1_kwayprojections_svdend ........... Passed 150.06 sec Start 1138: shm_example_simple_lap_z_facto0_sched1_kwayprojections_rqrrtend 627/3626 Test #901: shm_example_simple_lap_d_facto1_sched1_kway_rqrcpend .................... Passed 197.40 sec Start 1139: shm_example_simple_lap_z_facto0_sched1_kway_pqrcpilu0 628/3626 Test #885: shm_example_simple_lap_d_facto0_sched1_kway_pqrcpilu1 ...................***Timeout 200.03 sec Start 885: shm_example_simple_lap_d_facto0_sched1_kway_pqrcpilu1 628/3626 Test #889: shm_example_simple_lap_d_facto1_sched1_kway_svdend ......................***Timeout 200.03 sec Start 889: shm_example_simple_lap_d_facto1_sched1_kway_svdend 628/3626 Test #894: shm_example_simple_lap_d_facto1_sched1_kway_pqrcpbegin ..................***Timeout 200.07 sec Start 894: shm_example_simple_lap_d_facto1_sched1_kway_pqrcpbegin 628/3626 Test #898: shm_example_simple_lap_d_facto1_sched1_not_rqrcpbegin ...................***Timeout 200.03 sec Start 898: shm_example_simple_lap_d_facto1_sched1_not_rqrcpbegin 628/3626 Test #951: shm_example_simple_lap_c_facto0_sched1_kway_svdbegin .................... Passed 157.97 sec Start 1140: shm_example_simple_lap_z_facto0_sched1_kway_pqrcpilu1 629/3626 Test #963: shm_example_simple_lap_c_facto0_sched1_kway_rqrcpbegin .................. Passed 150.80 sec Start 1141: shm_example_simple_lap_z_facto1_sched1_not_svdbegin 630/3626 Test #932: shm_example_simple_lap_d_facto2_sched1_kway_rqrcpbegin .................. Passed 181.47 sec Start 1142: shm_example_simple_lap_z_facto1_sched1_not_svdend 631/3626 Test #985: shm_example_simple_lap_c_facto1_sched1_kwayprojections_svdbegin ......... Passed 133.88 sec Start 1143: shm_example_simple_lap_z_facto1_sched1_kway_svdbegin 632/3626 Test #998: shm_example_simple_lap_c_facto1_sched1_kwayprojections_rqrcpend ......... 
Passed 126.55 sec Start 1144: shm_example_simple_lap_z_facto1_sched1_kway_svdend Test #638: shm_example_simple_lap_z_facto1_sched0_kway_pqrcpbegin .................. Passed 127.44 sec Start 1145: shm_example_simple_lap_z_facto1_sched1_kwayprojections_svdbegin 634/3626 Test #956: shm_example_simple_lap_c_facto0_sched1_not_pqrcpend ..................... Passed 155.59 sec Start 1146: shm_example_simple_lap_z_facto1_sched1_kwayprojections_svdend 635/3626 Test #1000: shm_example_simple_lap_c_facto1_sched1_not_tqrcpend ..................... Passed 127.00 sec Start 1147: shm_example_simple_lap_z_facto1_sched1_not_pqrcpbegin 636/3626 Test #947: shm_example_simple_lap_d_facto2_sched1_kway_pqrcpilu0 ................... Passed 166.56 sec Start 1148: shm_example_simple_lap_z_facto1_sched1_not_pqrcpend 637/3626 Test #978: shm_example_simple_lap_c_facto0_sched1_kwayprojections_rqrrtend ......... Passed 143.37 sec Start 1149: shm_example_simple_lap_z_facto1_sched1_kway_pqrcpbegin 638/3626 Test #933: shm_example_simple_lap_d_facto2_sched1_kway_rqrcpend .................... Passed 185.09 sec Start 1150: shm_example_simple_lap_z_facto1_sched1_kway_pqrcpend 639/3626 Test #911: shm_example_simple_lap_d_facto1_sched1_not_rqrrtend .....................***Timeout 200.03 sec Start 911: shm_example_simple_lap_d_facto1_sched1_not_rqrrtend 639/3626 Test #997: shm_example_simple_lap_c_facto1_sched1_kwayprojections_rqrcpbegin ....... Passed 130.51 sec Start 1151: shm_example_simple_lap_z_facto1_sched1_kwayprojections_pqrcpbegin 640/3626 Test #913: shm_example_simple_lap_d_facto1_sched1_kway_rqrrtend ....................***Timeout 200.03 sec Start 913: shm_example_simple_lap_d_facto1_sched1_kway_rqrrtend 640/3626 Test #914: shm_example_simple_lap_d_facto1_sched1_kwayprojections_rqrrtbegin .......***Timeout 200.02 sec Start 914: shm_example_simple_lap_d_facto1_sched1_kwayprojections_rqrrtbegin 640/3626 Test #1013: shm_example_simple_lap_c_facto2_sched1_not_svdbegin ..................... 
Passed 130.96 sec Start 1152: shm_example_simple_lap_z_facto1_sched1_kwayprojections_pqrcpend 641/3626 Test #986: shm_example_simple_lap_c_facto1_sched1_kwayprojections_svdend ........... Passed 144.05 sec Start 1153: shm_example_simple_lap_z_facto1_sched1_not_rqrcpbegin 642/3626 Test #995: shm_example_simple_lap_c_facto1_sched1_kway_rqrcpbegin .................. Passed 138.39 sec Start 1154: shm_example_simple_lap_z_facto1_sched1_not_rqrcpend 643/3626 Test #1011: shm_example_simple_lap_c_facto1_sched1_kway_pqrcpilu0 ................... Passed 133.90 sec Start 1155: shm_example_simple_lap_z_facto1_sched1_kway_rqrcpbegin 644/3626 Test #980: shm_example_simple_lap_c_facto0_sched1_kway_pqrcpilu1 ................... Passed 148.90 sec Start 1156: shm_example_simple_lap_z_facto1_sched1_kway_rqrcpend 645/3626 Test #927: shm_example_simple_lap_d_facto2_sched1_kway_pqrcpend ....................***Timeout 200.05 sec Start 927: shm_example_simple_lap_d_facto2_sched1_kway_pqrcpend 645/3626 Test #993: shm_example_simple_lap_c_facto1_sched1_not_rqrcpbegin ................... Passed 139.75 sec Start 1157: shm_example_simple_lap_z_facto1_sched1_kwayprojections_rqrcpbegin 646/3626 Test #928: shm_example_simple_lap_d_facto2_sched1_kwayprojections_pqrcpbegin .......***Timeout 200.06 sec Start 928: shm_example_simple_lap_d_facto2_sched1_kwayprojections_pqrcpbegin Test #702: shm_example_simple_lap_z_facto3_sched0_kway_pqrcpbegin .................. Passed 133.69 sec Start 1158: shm_example_simple_lap_z_facto1_sched1_kwayprojections_rqrcpend 647/3626 Test #983: shm_example_simple_lap_c_facto1_sched1_kway_svdbegin .................... Passed 146.54 sec Start 1159: shm_example_simple_lap_z_facto1_sched1_not_tqrcpbegin 648/3626 Test #1009: shm_example_simple_lap_c_facto1_sched1_kwayprojections_rqrrtbegin ....... Passed 135.54 sec Start 1160: shm_example_simple_lap_z_facto1_sched1_not_tqrcpend Test #656: shm_example_simple_lap_z_facto1_sched0_kway_rqrrtbegin .................. 
Passed 138.06 sec Start 1161: shm_example_simple_lap_z_facto1_sched1_kway_tqrcpbegin 650/3626 Test #1003: shm_example_simple_lap_c_facto1_sched1_kwayprojections_tqrcpbegin ....... Passed 138.34 sec Start 1162: shm_example_simple_lap_z_facto1_sched1_kway_tqrcpend 651/3626 Test #1020: shm_example_simple_lap_c_facto2_sched1_not_pqrcpend ..................... Passed 134.80 sec Start 1163: shm_example_simple_lap_z_facto1_sched1_kwayprojections_tqrcpbegin 652/3626 Test #931: shm_example_simple_lap_d_facto2_sched1_not_rqrcpend .....................***Timeout 200.14 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Start 931: shm_example_simple_lap_d_facto2_sched1_not_rqrcpend 652/3626 Test #984: shm_example_simple_lap_c_facto1_sched1_kway_svdend ...................... Passed 148.09 sec Start 1164: shm_example_simple_lap_z_facto1_sched1_kwayprojections_tqrcpend 653/3626 Test #1028: shm_example_simple_lap_c_facto2_sched1_kway_rqrcpend .................... 
Passed 133.55 sec Start 1165: shm_example_simple_lap_z_facto1_sched1_not_rqrrtbegin 654/3626 Test #945: shm_example_simple_lap_d_facto2_sched1_kwayprojections_rqrrtbegin ....... Passed 181.13 sec Start 1166: shm_example_simple_lap_z_facto1_sched1_not_rqrrtend 655/3626 Test #960: shm_example_simple_lap_c_facto0_sched1_kwayprojections_pqrcpend ......... Passed 166.22 sec Start 1167: shm_example_simple_lap_z_facto1_sched1_kway_rqrrtbegin 656/3626 Test #1007: shm_example_simple_lap_c_facto1_sched1_kway_rqrrtbegin .................. Passed 138.12 sec Start 1168: shm_example_simple_lap_z_facto1_sched1_kway_rqrrtend Test #672: shm_example_simple_lap_z_facto2_sched0_kwayprojections_pqrcpbegin ....... Passed 138.86 sec Start 1169: shm_example_simple_lap_z_facto1_sched1_kwayprojections_rqrrtbegin 658/3626 Test #969: shm_example_simple_lap_c_facto0_sched1_kway_tqrcpbegin .................. Passed 161.84 sec Start 1170: shm_example_simple_lap_z_facto1_sched1_kwayprojections_rqrrtend 659/3626 Test #965: shm_example_simple_lap_c_facto0_sched1_kwayprojections_rqrcpbegin ....... Passed 166.48 sec Start 1171: shm_example_simple_lap_z_facto1_sched1_kway_pqrcpilu0 660/3626 Test #1034: shm_example_simple_lap_c_facto2_sched1_kway_tqrcpend .................... Passed 134.17 sec Start 1172: shm_example_simple_lap_z_facto1_sched1_kway_pqrcpilu1 661/3626 Test #1029: shm_example_simple_lap_c_facto2_sched1_kwayprojections_rqrcpbegin ....... Passed 135.85 sec Start 1173: shm_example_simple_lap_z_facto2_sched1_not_svdbegin 662/3626 Test #975: shm_example_simple_lap_c_facto0_sched1_kway_rqrrtbegin .................. Passed 161.02 sec Start 1174: shm_example_simple_lap_z_facto2_sched1_not_svdend 663/3626 Test #989: shm_example_simple_lap_c_facto1_sched1_kway_pqrcpbegin .................. Passed 149.64 sec Start 1175: shm_example_simple_lap_z_facto2_sched1_kway_svdbegin 664/3626 Test #1002: shm_example_simple_lap_c_facto1_sched1_kway_tqrcpend .................... 
Passed 142.05 sec Start 1176: shm_example_simple_lap_z_facto2_sched1_kway_svdend 665/3626 Test #1036: shm_example_simple_lap_c_facto2_sched1_kwayprojections_tqrcpend ......... Passed 135.59 sec Start 1177: shm_example_simple_lap_z_facto2_sched1_kwayprojections_svdbegin 666/3626 Test #1021: shm_example_simple_lap_c_facto2_sched1_kway_pqrcpbegin .................. Passed 138.68 sec Start 1178: shm_example_simple_lap_z_facto2_sched1_kwayprojections_svdend Test #716: shm_example_simple_lap_z_facto3_sched0_kwayprojections_tqrcpbegin ....... Passed 139.30 sec Start 1179: shm_example_simple_lap_z_facto2_sched1_not_pqrcpbegin 668/3626 Test #962: shm_example_simple_lap_c_facto0_sched1_not_rqrcpend ..................... Passed 169.27 sec Start 1180: shm_example_simple_lap_z_facto2_sched1_not_pqrcpend 669/3626 Test #1018: shm_example_simple_lap_c_facto2_sched1_kwayprojections_svdend ........... Passed 139.21 sec Start 1181: shm_example_simple_lap_z_facto2_sched1_kway_pqrcpbegin 670/3626 Test #1014: shm_example_simple_lap_c_facto2_sched1_not_svdend ....................... Passed 141.53 sec Start 1182: shm_example_simple_lap_z_facto2_sched1_kway_pqrcpend Test #678: shm_example_simple_lap_z_facto2_sched0_kwayprojections_rqrcpbegin ....... Passed 138.41 sec Start 1183: shm_example_simple_lap_z_facto2_sched1_kwayprojections_pqrcpbegin 672/3626 Test #1037: shm_example_simple_lap_c_facto2_sched1_not_rqrrtbegin ................... Passed 137.12 sec Start 1184: shm_example_simple_lap_z_facto2_sched1_kwayprojections_pqrcpend 673/3626 Test #1017: shm_example_simple_lap_c_facto2_sched1_kwayprojections_svdbegin ......... Passed 142.17 sec Start 1185: shm_example_simple_lap_z_facto2_sched1_not_rqrcpbegin 674/3626 Test #1015: shm_example_simple_lap_c_facto2_sched1_kway_svdbegin .................... Passed 142.46 sec Start 1186: shm_example_simple_lap_z_facto2_sched1_not_rqrcpend Test #722: shm_example_simple_lap_z_facto3_sched0_kwayprojections_rqrrtbegin ....... 
Passed 142.59 sec Start 1187: shm_example_simple_lap_z_facto2_sched1_kway_rqrcpbegin 676/3626 Test #1024: shm_example_simple_lap_c_facto2_sched1_kwayprojections_pqrcpend ......... Passed 141.75 sec Start 1188: shm_example_simple_lap_z_facto2_sched1_kway_rqrcpend 677/3626 Test #1027: shm_example_simple_lap_c_facto2_sched1_kway_rqrcpbegin .................. Passed 141.31 sec Start 1189: shm_example_simple_lap_z_facto2_sched1_kwayprojections_rqrcpbegin 678/3626 Test #1019: shm_example_simple_lap_c_facto2_sched1_not_pqrcpbegin ................... Passed 142.64 sec Start 1190: shm_example_simple_lap_z_facto2_sched1_kwayprojections_rqrcpend 679/3626 Test #937: shm_example_simple_lap_d_facto2_sched1_kway_tqrcpbegin ..................***Timeout 200.06 sec Start 937: shm_example_simple_lap_d_facto2_sched1_kway_tqrcpbegin 679/3626 Test #1025: shm_example_simple_lap_c_facto2_sched1_not_rqrcpbegin ................... Passed 141.86 sec Start 1191: shm_example_simple_lap_z_facto2_sched1_not_tqrcpbegin 680/3626 Test #1010: shm_example_simple_lap_c_facto1_sched1_kwayprojections_rqrrtend ......... Passed 147.64 sec Start 1192: shm_example_simple_lap_z_facto2_sched1_not_tqrcpend 681/3626 Test #1032: shm_example_simple_lap_c_facto2_sched1_not_tqrcpend ..................... Passed 145.94 sec Start 1193: shm_example_simple_lap_z_facto2_sched1_kway_tqrcpbegin 682/3626 Test #968: shm_example_simple_lap_c_facto0_sched1_not_tqrcpend ..................... Passed 175.22 sec Start 1194: shm_example_simple_lap_z_facto2_sched1_kway_tqrcpend 683/3626 Test #991: shm_example_simple_lap_c_facto1_sched1_kwayprojections_pqrcpbegin ....... Passed 159.37 sec Start 1195: shm_example_simple_lap_z_facto2_sched1_kwayprojections_tqrcpbegin Test #721: shm_example_simple_lap_z_facto3_sched0_kway_rqrrtend .................... Passed 146.90 sec Start 1196: shm_example_simple_lap_z_facto2_sched1_kwayprojections_tqrcpend Test #712: shm_example_simple_lap_z_facto3_sched0_not_tqrcpbegin ................... 
Passed 148.50 sec Start 1197: shm_example_simple_lap_z_facto2_sched1_not_rqrrtbegin Test #727: shm_example_simple_lap_z_facto4_sched0_not_svdend ....................... Passed 152.53 sec Start 1198: shm_example_simple_lap_z_facto2_sched1_not_rqrrtend 687/3626 Test #943: shm_example_simple_lap_d_facto2_sched1_kway_rqrrtbegin ..................***Timeout 200.03 sec Start 943: shm_example_simple_lap_d_facto2_sched1_kway_rqrrtbegin Test #677: shm_example_simple_lap_z_facto2_sched0_kway_rqrcpend .................... Passed 154.59 sec Start 1199: shm_example_simple_lap_z_facto2_sched1_kway_rqrrtbegin 688/3626 Test #952: shm_example_simple_lap_c_facto0_sched1_kway_svdend ...................... Passed 195.36 sec Start 1200: shm_example_simple_lap_z_facto2_sched1_kway_rqrrtend Test #730: shm_example_simple_lap_z_facto4_sched0_kwayprojections_svdbegin ......... Passed 158.66 sec Start 1201: shm_example_simple_lap_z_facto2_sched1_kwayprojections_rqrrtbegin 690/3626 Test #981: shm_example_simple_lap_c_facto1_sched1_not_svdbegin ..................... Passed 174.38 sec Start 1202: shm_example_simple_lap_z_facto2_sched1_kwayprojections_rqrrtend 691/3626 Test #949: shm_example_simple_lap_c_facto0_sched1_not_svdbegin .....................***Timeout 200.02 sec Start 949: shm_example_simple_lap_c_facto0_sched1_not_svdbegin 691/3626 Test #957: shm_example_simple_lap_c_facto0_sched1_kway_pqrcpbegin .................. Passed 193.29 sec Start 1203: shm_example_simple_lap_z_facto2_sched1_kway_pqrcpilu0 692/3626 Test #950: shm_example_simple_lap_c_facto0_sched1_not_svdend .......................***Timeout 200.03 sec Start 950: shm_example_simple_lap_c_facto0_sched1_not_svdend 692/3626 Test #996: shm_example_simple_lap_c_facto1_sched1_kway_rqrcpend .................... Passed 171.22 sec Start 1204: shm_example_simple_lap_z_facto2_sched1_kway_pqrcpilu1 693/3626 Test #953: shm_example_simple_lap_c_facto0_sched1_kwayprojections_svdbegin ......... 
Passed 199.40 sec Start 1205: shm_example_simple_lap_z_facto3_sched1_not_svdbegin 694/3626 Test #1012: shm_example_simple_lap_c_facto1_sched1_kway_pqrcpilu1 ................... Passed 167.34 sec Start 1206: shm_example_simple_lap_z_facto3_sched1_not_svdend Test #163: c_shm_example_simple_mixed_lap_z_facto0 ................................. Passed 152.96 sec Start 1207: shm_example_simple_lap_z_facto3_sched1_kway_svdbegin 696/3626 Test #970: shm_example_simple_lap_c_facto0_sched1_kway_tqrcpend .................... Passed 191.72 sec Start 1208: shm_example_simple_lap_z_facto3_sched1_kway_svdend 697/3626 Test #1038: shm_example_simple_lap_c_facto2_sched1_not_rqrrtend ..................... Passed 162.62 sec Start 1209: shm_example_simple_lap_z_facto3_sched1_kwayprojections_svdbegin 698/3626 Test #958: shm_example_simple_lap_c_facto0_sched1_kway_pqrcpend .................... Passed 198.79 sec Start 1210: shm_example_simple_lap_z_facto3_sched1_kwayprojections_svdend Test #162: c_shm_example_simple_mixed_lap_d_refine_bicgstab_sym .................... Passed 153.60 sec Start 1211: shm_example_simple_lap_z_facto3_sched1_not_pqrcpbegin Test #732: shm_example_simple_lap_z_facto4_sched0_not_pqrcpbegin ................... Passed 166.37 sec Start 1212: shm_example_simple_lap_z_facto3_sched1_not_pqrcpend Test #393: shm_example_simple_lap_d_facto1_sched0_not_tqrcpend ..................... Passed 149.47 sec Start 1213: shm_example_simple_lap_z_facto3_sched1_kway_pqrcpbegin 702/3626 Test #1008: shm_example_simple_lap_c_facto1_sched1_kway_rqrrtend .................... Passed 170.56 sec Start 1214: shm_example_simple_lap_z_facto3_sched1_kway_pqrcpend Test #682: shm_example_simple_lap_z_facto2_sched0_kway_tqrcpbegin .................. 
Passed 168.74 sec Start 1215: shm_example_simple_lap_z_facto3_sched1_kwayprojections_pqrcpbegin 704/3626 Test #959: shm_example_simple_lap_c_facto0_sched1_kwayprojections_pqrcpbegin .......***Timeout 200.08 sec Start 959: shm_example_simple_lap_c_facto0_sched1_kwayprojections_pqrcpbegin Test #215: shm_example_simple_lap_c_facto3_sched4_1d ............................... Passed 154.42 sec Start 1216: shm_example_simple_lap_z_facto3_sched1_kwayprojections_pqrcpend 705/3626 Test #994: shm_example_simple_lap_c_facto1_sched1_not_rqrcpend ..................... Passed 175.98 sec Start 1217: shm_example_simple_lap_z_facto3_sched1_not_rqrcpbegin Test #676: shm_example_simple_lap_z_facto2_sched0_kway_rqrcpbegin .................. Passed 171.58 sec Start 1218: shm_example_simple_lap_z_facto3_sched1_not_rqrcpend 707/3626 Test #964: shm_example_simple_lap_c_facto0_sched1_kway_rqrcpend .................... Passed 199.36 sec Start 1219: shm_example_simple_lap_z_facto3_sched1_kway_rqrcpbegin Test #156: c_shm_example_simple_mixed_refine_bicgstab .............................. Passed 156.48 sec Start 1220: shm_example_simple_lap_z_facto3_sched1_kway_rqrcpend 709/3626 Test #979: shm_example_simple_lap_c_facto0_sched1_kway_pqrcpilu0 ................... Passed 190.26 sec Start 1221: shm_example_simple_lap_z_facto3_sched1_kwayprojections_rqrcpbegin Test #197: shm_example_simple_lap_c_facto1_sched1_1d ............................... Passed 158.51 sec Start 1222: shm_example_simple_lap_z_facto3_sched1_kwayprojections_rqrcpend Test #205: shm_example_simple_lap_z_facto4_sched1_1d ............................... Passed 158.40 sec Start 1223: shm_example_simple_lap_z_facto3_sched1_not_tqrcpbegin Test #416: shm_example_simple_lap_d_facto2_sched0_kwayprojections_pqrcpbegin ....... 
Passed 156.47 sec Start 1224: shm_example_simple_lap_z_facto3_sched1_not_tqrcpend 713/3626 Test #972: shm_example_simple_lap_c_facto0_sched1_kwayprojections_tqrcpend .........***Timeout 200.05 sec Start 972: shm_example_simple_lap_c_facto0_sched1_kwayprojections_tqrcpend 713/3626 Test #973: shm_example_simple_lap_c_facto0_sched1_not_rqrrtbegin ...................***Timeout 200.41 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 973: shm_example_simple_lap_c_facto0_sched1_not_rqrrtbegin 713/3626 Test #1030: shm_example_simple_lap_c_facto2_sched1_kwayprojections_rqrcpend ......... 
Passed 175.04 sec Start 1225: shm_example_simple_lap_z_facto3_sched1_kway_tqrcpbegin 714/3626 Test #977: shm_example_simple_lap_c_facto0_sched1_kwayprojections_rqrrtbegin .......***Timeout 200.09 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 977: shm_example_simple_lap_c_facto0_sched1_kwayprojections_rqrrtbegin Test #57: c_shm_example_simple_trans_lap_c_facto2 ................................. Passed 166.59 sec Start 1226: shm_example_simple_lap_z_facto3_sched1_kway_tqrcpend Test #679: shm_example_simple_lap_z_facto2_sched0_kwayprojections_rqrcpend ......... Passed 177.29 sec Start 1227: shm_example_simple_lap_z_facto3_sched1_kwayprojections_tqrcpbegin Test #409: shm_example_simple_lap_d_facto2_sched0_kway_svdend ...................... Passed 161.08 sec Start 1228: shm_example_simple_lap_z_facto3_sched1_kwayprojections_tqrcpend Test #460: shm_example_simple_lap_c_facto0_sched0_kwayprojections_tqrcpbegin ....... 
Passed 160.10 sec Start 1229: shm_example_simple_lap_z_facto3_sched1_not_rqrrtbegin Test #724: shm_example_simple_lap_z_facto3_sched0_kway_pqrcpilu0 ................... Passed 182.57 sec Start 1230: shm_example_simple_lap_z_facto3_sched1_not_rqrrtend Test #758: shm_example_simple_lap_s_facto0_sched1_not_svdbegin ..................... Passed 159.51 sec Start 1231: shm_example_simple_lap_z_facto3_sched1_kway_rqrrtbegin Test #32: c_shm_example_simple_lap_z_facto4 ....................................... Passed 171.58 sec Start 1232: shm_example_simple_lap_z_facto3_sched1_kway_rqrrtend 721/3626 Test #1022: shm_example_simple_lap_c_facto2_sched1_kway_pqrcpend .................... Passed 183.54 sec Start 1233: shm_example_simple_lap_z_facto3_sched1_kwayprojections_rqrrtbegin 722/3626 Test #1004: shm_example_simple_lap_c_facto1_sched1_kwayprojections_tqrcpend ......... Passed 188.14 sec Start 1234: shm_example_simple_lap_z_facto3_sched1_kwayprojections_rqrrtend 723/3626 Test #1006: shm_example_simple_lap_c_facto1_sched1_not_rqrrtend ..................... Passed 186.89 sec Start 1235: shm_example_simple_lap_z_facto3_sched1_kway_pqrcpilu0 724/3626 Test #999: shm_example_simple_lap_c_facto1_sched1_not_tqrcpbegin ................... Passed 188.70 sec Start 1236: shm_example_simple_lap_z_facto3_sched1_kway_pqrcpilu1 Test #214: shm_example_simple_lap_c_facto2_sched4_1d ............................... Passed 170.24 sec Start 1237: shm_example_simple_lap_z_facto4_sched1_not_svdbegin 726/3626 Test #1005: shm_example_simple_lap_c_facto1_sched1_not_rqrrtbegin ................... Passed 187.43 sec Start 1238: shm_example_simple_lap_z_facto4_sched1_not_svdend Test #754: shm_example_simple_lap_z_facto4_sched0_kwayprojections_rqrrtbegin ....... Passed 161.53 sec Start 1239: shm_example_simple_lap_z_facto4_sched1_kway_svdbegin Test #747: shm_example_simple_lap_z_facto4_sched0_kway_tqrcpend .................... 
Passed 163.12 sec Start 1240: shm_example_simple_lap_z_facto4_sched1_kway_svdend Test #763: shm_example_simple_lap_s_facto0_sched1_kwayprojections_svdend ........... Passed 161.99 sec Start 1241: shm_example_simple_lap_z_facto4_sched1_kwayprojections_svdbegin 730/3626 Test #982: shm_example_simple_lap_c_facto1_sched1_not_svdend .......................***Timeout 200.03 sec Start 982: shm_example_simple_lap_c_facto1_sched1_not_svdend Test #759: shm_example_simple_lap_s_facto0_sched1_not_svdend ....................... Passed 162.28 sec Start 1242: shm_example_simple_lap_z_facto4_sched1_kwayprojections_svdend 731/3626 Test #987: shm_example_simple_lap_c_facto1_sched1_not_pqrcpbegin ...................***Timeout 200.10 sec ischedInit: The thread number has been automatically set to 256 Start 987: shm_example_simple_lap_c_facto1_sched1_not_pqrcpbegin 731/3626 Test #988: shm_example_simple_lap_c_facto1_sched1_not_pqrcpend .....................***Timeout 200.06 sec Start 988: shm_example_simple_lap_c_facto1_sched1_not_pqrcpend 731/3626 Test #990: shm_example_simple_lap_c_facto1_sched1_kway_pqrcpend ....................***Timeout 200.08 sec Start 990: shm_example_simple_lap_c_facto1_sched1_kway_pqrcpend 731/3626 Test #1031: shm_example_simple_lap_c_facto2_sched1_not_tqrcpbegin ................... Passed 186.01 sec Start 1243: shm_example_simple_lap_z_facto4_sched1_not_pqrcpbegin Test #757: shm_example_simple_lap_z_facto4_sched0_kway_pqrcpilu1 ................... Passed 164.18 sec Start 1244: shm_example_simple_lap_z_facto4_sched1_not_pqrcpend 733/3626 Test #1033: shm_example_simple_lap_c_facto2_sched1_kway_tqrcpbegin .................. Passed 186.25 sec Start 1245: shm_example_simple_lap_z_facto4_sched1_kway_pqrcpbegin Test #760: shm_example_simple_lap_s_facto0_sched1_kway_svdbegin .................... Passed 164.35 sec Start 1246: shm_example_simple_lap_z_facto4_sched1_kway_pqrcpend Test #172: c_shm_example_simple_mixed_lap_z_refine_gmres_sym ....................... 
Passed 175.05 sec Start 1247: shm_example_simple_lap_z_facto4_sched1_kwayprojections_pqrcpbegin 736/3626 Test #1035: shm_example_simple_lap_c_facto2_sched1_kwayprojections_tqrcpbegin ....... Passed 186.50 sec Start 1248: shm_example_simple_lap_z_facto4_sched1_kwayprojections_pqrcpend 737/3626 Test #992: shm_example_simple_lap_c_facto1_sched1_kwayprojections_pqrcpend .........***Timeout 200.06 sec Start 992: shm_example_simple_lap_c_facto1_sched1_kwayprojections_pqrcpend Test #474: shm_example_simple_lap_c_facto1_sched0_kwayprojections_svdbegin ......... Passed 170.85 sec Start 1249: shm_example_simple_lap_z_facto4_sched1_not_rqrcpbegin Test #330: shm_example_simple_lap_s_facto2_sched0_kway_tqrcpbegin .................. Passed 175.09 sec Start 1250: shm_example_simple_lap_z_facto4_sched1_not_rqrcpend Test #200: shm_example_simple_lap_c_facto4_sched1_1d ............................... Passed 177.50 sec Start 1251: shm_example_simple_lap_z_facto4_sched1_kway_rqrcpbegin Test #478: shm_example_simple_lap_c_facto1_sched0_kway_pqrcpbegin .................. Passed 171.37 sec Start 1252: shm_example_simple_lap_z_facto4_sched1_kway_rqrcpend Test #382: shm_example_simple_lap_d_facto1_sched0_kway_pqrcpbegin .................. Passed 173.60 sec Start 1253: shm_example_simple_lap_z_facto4_sched1_kwayprojections_rqrcpbegin Test #56: c_shm_example_simple_trans_lap_c_facto1 ................................. Passed 179.56 sec Start 1254: shm_example_simple_lap_z_facto4_sched1_kwayprojections_rqrcpend Test #746: shm_example_simple_lap_z_facto4_sched0_kway_tqrcpbegin .................. Passed 169.60 sec Start 1255: shm_example_simple_lap_z_facto4_sched1_not_tqrcpbegin 744/3626 Test #1016: shm_example_simple_lap_c_facto2_sched1_kway_svdend ...................... Passed 194.66 sec Start 1256: shm_example_simple_lap_z_facto4_sched1_not_tqrcpend Test #740: shm_example_simple_lap_z_facto4_sched0_kway_rqrcpbegin .................. 
Passed 172.50 sec Start 1257: shm_example_simple_lap_z_facto4_sched1_kway_tqrcpbegin 746/3626 Test #1001: shm_example_simple_lap_c_facto1_sched1_kway_tqrcpbegin ..................***Timeout 200.16 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 1001: shm_example_simple_lap_c_facto1_sched1_kway_tqrcpbegin Test #157: c_shm_example_simple_mixed_lap_d_facto0 ................................. Passed 183.59 sec Start 1258: shm_example_simple_lap_z_facto4_sched1_kway_tqrcpend Test #711: shm_example_simple_lap_z_facto3_sched0_kwayprojections_rqrcpend .........***Timeout 200.04 sec Start 711: shm_example_simple_lap_z_facto3_sched0_kwayprojections_rqrcpend Test #735: shm_example_simple_lap_z_facto4_sched0_kway_pqrcpend .................... 
Passed 178.55 sec Start 1259: shm_example_simple_lap_z_facto4_sched1_kwayprojections_tqrcpbegin Test #687: shm_example_simple_lap_z_facto2_sched0_not_rqrrtend .....................***Timeout 200.21 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 687: shm_example_simple_lap_z_facto2_sched0_not_rqrrtend Test #294: shm_example_simple_lap_s_facto1_sched0_kwayprojections_rqrcpbegin ....... Passed 184.38 sec Start 1260: shm_example_simple_lap_z_facto4_sched1_kwayprojections_tqrcpend Test #751: shm_example_simple_lap_z_facto4_sched0_not_rqrrtend ..................... Passed 176.12 sec Start 1261: shm_example_simple_lap_z_facto4_sched1_not_rqrrtbegin Test #741: shm_example_simple_lap_z_facto4_sched0_kway_rqrcpend .................... 
Passed 178.34 sec Start 1262: shm_example_simple_lap_z_facto4_sched1_not_rqrrtend 751/3626 Test #1023: shm_example_simple_lap_c_facto2_sched1_kwayprojections_pqrcpbegin .......***Timeout 200.12 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 1023: shm_example_simple_lap_c_facto2_sched1_kwayprojections_pqrcpbegin 751/3626 Test #1026: shm_example_simple_lap_c_facto2_sched1_not_rqrcpend .....................***Timeout 200.03 sec Start 1026: shm_example_simple_lap_c_facto2_sched1_not_rqrcpend Test #291: shm_example_simple_lap_s_facto1_sched0_not_rqrcpend ..................... Passed 185.41 sec Start 1263: shm_example_simple_lap_z_facto4_sched1_kway_rqrrtbegin Test #745: shm_example_simple_lap_z_facto4_sched0_not_tqrcpend ..................... Passed 179.17 sec Start 1264: shm_example_simple_lap_z_facto4_sched1_kway_rqrrtend Test #203: shm_example_simple_lap_z_facto2_sched1_1d ............................... 
Passed 187.75 sec Start 1265: shm_example_simple_lap_z_facto4_sched1_kwayprojections_rqrrtbegin Test #326: shm_example_simple_lap_s_facto2_sched0_kwayprojections_rqrcpbegin ....... Passed 186.74 sec Start 1266: shm_example_simple_lap_z_facto4_sched1_kwayprojections_rqrrtend Test #376: shm_example_simple_lap_d_facto1_sched0_kway_svdbegin .................... Passed 184.41 sec Start 1267: shm_example_simple_lap_z_facto4_sched1_kway_pqrcpilu0 Test #473: shm_example_simple_lap_c_facto1_sched0_kway_svdend ...................... Passed 183.31 sec Start 1268: shm_example_simple_lap_z_facto4_sched1_kway_pqrcpilu1 Test #372: shm_example_simple_lap_d_facto0_sched0_kway_pqrcpilu0 ................... Passed 185.87 sec Start 1269: shm_example_simple_lap_s_facto0_sched4_not_svdbegin Test #737: shm_example_simple_lap_z_facto4_sched0_kwayprojections_pqrcpend ......... Passed 182.02 sec Start 1270: shm_example_simple_lap_s_facto0_sched4_not_svdend Test #488: shm_example_simple_lap_c_facto1_sched0_not_tqrcpbegin ................... Passed 183.20 sec Start 1271: shm_example_simple_lap_s_facto0_sched4_kway_svdbegin Test #761: shm_example_simple_lap_s_facto0_sched1_kway_svdend ...................... Passed 179.33 sec Start 1272: shm_example_simple_lap_s_facto0_sched4_kway_svdend Test #778: shm_example_simple_lap_s_facto0_sched1_kway_tqrcpbegin .................. Passed 178.59 sec Start 1273: shm_example_simple_lap_s_facto0_sched4_kwayprojections_svdbegin Test #166: c_shm_example_simple_mixed_lap_z_facto3 ................................. Passed 191.49 sec Start 1274: shm_example_simple_lap_s_facto0_sched4_kwayprojections_svdend Test #775: shm_example_simple_lap_s_facto0_sched1_kwayprojections_rqrcpend ......... Passed 180.78 sec Start 1275: shm_example_simple_lap_s_facto0_sched4_not_pqrcpbegin Test #771: shm_example_simple_lap_s_facto0_sched1_not_rqrcpend ..................... 
Passed 180.73 sec Start 1276: shm_example_simple_lap_s_facto0_sched4_not_pqrcpend Test #799: shm_example_simple_lap_s_facto1_sched1_kway_pqrcpend .................... Passed 178.30 sec Start 1277: shm_example_simple_lap_s_facto0_sched4_kway_pqrcpbegin Test #797: shm_example_simple_lap_s_facto1_sched1_not_pqrcpend ..................... Passed 178.95 sec Start 1278: shm_example_simple_lap_s_facto0_sched4_kway_pqrcpend Test #192: shm_example_simple_lap_s_facto2_sched1_1d ............................... Passed 173.69 sec Start 1279: shm_example_simple_lap_s_facto0_sched4_kwayprojections_pqrcpbegin Test #217: shm_example_simple_lap_z_facto0_sched4_1d ............................... Passed 174.33 sec Start 1280: shm_example_simple_lap_s_facto0_sched4_kwayprojections_pqrcpend Test #445: shm_example_simple_lap_c_facto0_sched0_not_pqrcpend ..................... Passed 189.34 sec Start 1281: shm_example_simple_lap_s_facto0_sched4_not_rqrcpbegin Test #788: shm_example_simple_lap_s_facto0_sched1_kway_pqrcpilu0 ................... Passed 181.69 sec Start 1282: shm_example_simple_lap_s_facto0_sched4_not_rqrcpend Test #821: shm_example_simple_lap_s_facto1_sched1_kway_pqrcpilu1 ................... Passed 174.07 sec Start 1283: shm_example_simple_lap_s_facto0_sched4_kway_rqrcpbegin Test #54: c_shm_example_simple_trans_lap_d_facto2 ................................. Passed 178.33 sec Start 1284: shm_example_simple_lap_s_facto0_sched4_kway_rqrcpend Test #284: shm_example_simple_lap_s_facto1_sched0_not_pqrcpbegin ................... Passed 175.31 sec Start 1285: shm_example_simple_lap_s_facto0_sched4_kwayprojections_rqrcpbegin Test #789: shm_example_simple_lap_s_facto0_sched1_kway_pqrcpilu1 ................... Passed 182.37 sec Start 1286: shm_example_simple_lap_s_facto0_sched4_kwayprojections_rqrcpend Test #46: c_shm_example_simple_solve_and_refine_lap_z_facto2 ...................... 
Passed 179.88 sec Start 1287: shm_example_simple_lap_s_facto0_sched4_not_tqrcpbegin Test #753: shm_example_simple_lap_z_facto4_sched0_kway_rqrrtend .................... Passed 186.45 sec Start 1288: shm_example_simple_lap_s_facto0_sched4_not_tqrcpend Test #772: shm_example_simple_lap_s_facto0_sched1_kway_rqrcpbegin .................. Passed 184.72 sec Start 1289: shm_example_simple_lap_s_facto0_sched4_kway_tqrcpbegin Test #17: c_shm_example_simple_lap_s_facto0 ....................................... Passed 181.49 sec Start 1290: shm_example_simple_lap_s_facto0_sched4_kway_tqrcpend Test #466: shm_example_simple_lap_c_facto0_sched0_kwayprojections_rqrrtbegin ....... Passed 177.62 sec Start 1291: shm_example_simple_lap_s_facto0_sched4_kwayprojections_tqrcpbegin Test #779: shm_example_simple_lap_s_facto0_sched1_kway_tqrcpend .................... Passed 185.71 sec Start 1292: shm_example_simple_lap_s_facto0_sched4_kwayprojections_tqrcpend Test #62: c_shm_example_simple_trans_lap_z_facto2 .................................***Timeout 203.15 sec Test #155: c_shm_example_simple_mixed_refine_gmres .................................***Timeout 203.11 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: General Arithmetic: Double Format: CSC N: 1030 nnz: 6858 Test #160: c_shm_example_simple_mixed_lap_d_refine_cg_sym ..........................***Timeout 
202.68 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Test #165: c_shm_example_simple_mixed_lap_z_facto2 .................................***Timeout 202.61 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Test #169: c_shm_example_simple_mixed_lap_z_refine_gmres_her .......................***Timeout 201.92 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread 
dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Complex64 Format: CSC N: 1000 nnz: 11476 Test #201: shm_example_simple_lap_z_facto0_sched1_1d ...............................***Timeout 201.70 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Test #216: shm_example_simple_lap_c_facto4_sched4_1d ...............................***Timeout 201.53 sec Test #218: shm_example_simple_lap_z_facto1_sched4_1d ...............................***Timeout 201.31 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple 
Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Test #219: shm_example_simple_lap_z_facto2_sched4_1d ...............................***Timeout 201.18 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Test #252: shm_example_simple_lap_s_facto0_sched0_not_pqrcpbegin ...................***Timeout 200.87 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress 
min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Test #258: shm_example_simple_lap_s_facto0_sched0_not_rqrcpbegin ...................***Timeout 200.71 sec Test #282: shm_example_simple_lap_s_facto1_sched0_kwayprojections_svdbegin .........***Timeout 200.54 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Test #286: shm_example_simple_lap_s_facto1_sched0_kway_pqrcpbegin ..................***Timeout 200.44 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: 
Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Test #332: shm_example_simple_lap_s_facto2_sched0_kwayprojections_tqrcpbegin .......***Timeout 199.56 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Test #339: 
shm_example_simple_lap_s_facto2_sched0_kwayprojections_rqrrtend .........***Timeout 199.20 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Test #362: shm_example_simple_lap_d_facto0_sched0_kway_tqrcpbegin ..................***Timeout 199.18 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress 
minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Test #363: shm_example_simple_lap_d_facto0_sched0_kway_tqrcpend ....................***Timeout 199.16 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Test #364: shm_example_simple_lap_d_facto0_sched0_kwayprojections_tqrcpbegin .......***Timeout 198.76 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 
Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Test #375: shm_example_simple_lap_d_facto1_sched0_not_svdend .......................***Timeout 198.21 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Test #388: shm_example_simple_lap_d_facto1_sched0_kway_rqrcpbegin ..................***Timeout 197.50 sec 
ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Test #419: shm_example_simple_lap_d_facto2_sched0_not_rqrcpend .....................***Timeout 197.43 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute 
Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.077113e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.106020e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.726585e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 1.624908e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.360655e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.943044e-02 s Time to initialize coeftab 5.212830e-02 s Time to factorize 1.215252e+00 s ( 8.22 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 3.973386e-02 s Time for refinement 2.549546e-02 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.643681e-16 max(|| b_i - A x_i ||_1) 1.795106e-16 max(|| b_i - 
A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.255705e-03 (SUCCESS) Test #420: shm_example_simple_lap_d_facto2_sched0_kway_rqrcpbegin ..................***Timeout 197.24 sec Test #426: shm_example_simple_lap_d_facto2_sched0_kway_tqrcpbegin ..................***Timeout 197.03 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Test #436: shm_example_simple_lap_d_facto2_sched0_kway_pqrcpilu0 ...................***Timeout 196.84 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 
Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.395419e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.360898e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.340552e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 3.915895e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.094377e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.202687e-01 s Time to initialize coeftab 7.161635e-02 s Time to factorize 1.887059e+00 s ( 5.29 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko 
------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 2.354744e-02 s - iteration 1 : total iteration time 0.0132 s error 5.9009e-15 Time for refinement 3.568261e-02 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.904661e-15 max(|| b_i - A x_i ||_1) 5.905572e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.420859e-03 (SUCCESS) Test #450: shm_example_simple_lap_c_facto0_sched0_not_rqrcpbegin ...................***Timeout 196.37 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Test #453: shm_example_simple_lap_c_facto0_sched0_kway_rqrcpend ....................***Timeout 196.17 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread 
static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Test #463: shm_example_simple_lap_c_facto0_sched0_not_rqrrtend .....................***Timeout 196.12 sec Test #476: shm_example_simple_lap_c_facto1_sched0_not_pqrcpbegin ...................***Timeout 195.64 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 
Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Test #484: shm_example_simple_lap_c_facto1_sched0_kway_rqrcpbegin ..................***Timeout 195.50 sec Test #490: shm_example_simple_lap_c_facto1_sched0_kway_tqrcpbegin ..................***Timeout 195.41 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Test #492: shm_example_simple_lap_c_facto1_sched0_kwayprojections_tqrcpbegin .......***Timeout 195.34 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size 
(min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Test #502: shm_example_simple_lap_c_facto2_sched0_not_svdbegin .....................***Timeout 194.96 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Test #738: shm_example_simple_lap_z_facto4_sched0_not_rqrcpbegin ...................***Timeout 194.41 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse 
matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 738: shm_example_simple_lap_z_facto4_sched0_not_rqrcpbegin Test #739: shm_example_simple_lap_z_facto4_sched0_not_rqrcpend .....................***Timeout 194.66 sec Start 739: shm_example_simple_lap_z_facto4_sched0_not_rqrcpend Test #742: shm_example_simple_lap_z_facto4_sched0_kwayprojections_rqrcpbegin .......***Timeout 194.45 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal 
width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 742: shm_example_simple_lap_z_facto4_sched0_kwayprojections_rqrcpbegin Test #743: shm_example_simple_lap_z_facto4_sched0_kwayprojections_rqrcpend .........***Timeout 194.23 sec Start 743: shm_example_simple_lap_z_facto4_sched0_kwayprojections_rqrcpend Test #748: shm_example_simple_lap_z_facto4_sched0_kwayprojections_tqrcpbegin .......***Timeout 193.24 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 748: shm_example_simple_lap_z_facto4_sched0_kwayprojections_tqrcpbegin Test #750: shm_example_simple_lap_z_facto4_sched0_not_rqrrtbegin ...................***Timeout 192.87 sec Start 
750: shm_example_simple_lap_z_facto4_sched0_not_rqrrtbegin Test #764: shm_example_simple_lap_s_facto0_sched1_not_pqrcpbegin ...................***Timeout 192.18 sec Start 764: shm_example_simple_lap_s_facto0_sched1_not_pqrcpbegin Test #767: shm_example_simple_lap_s_facto0_sched1_kway_pqrcpend ....................***Timeout 191.74 sec Start 767: shm_example_simple_lap_s_facto0_sched1_kway_pqrcpend Test #769: shm_example_simple_lap_s_facto0_sched1_kwayprojections_pqrcpend .........***Timeout 191.60 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 769: shm_example_simple_lap_s_facto0_sched1_kwayprojections_pqrcpend Test #773: shm_example_simple_lap_s_facto0_sched1_kway_rqrcpend ....................***Timeout 191.74 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 
Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 773: shm_example_simple_lap_s_facto0_sched1_kway_rqrcpend Test #783: shm_example_simple_lap_s_facto0_sched1_not_rqrrtend .....................***Timeout 191.69 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections 
width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 783: shm_example_simple_lap_s_facto0_sched1_not_rqrrtend Test #785: shm_example_simple_lap_s_facto0_sched1_kway_rqrrtend ....................***Timeout 191.70 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 785: shm_example_simple_lap_s_facto0_sched1_kway_rqrrtend Test #762: shm_example_simple_lap_s_facto0_sched1_kwayprojections_svdbegin .........***Timeout 191.66 sec Start 762: shm_example_simple_lap_s_facto0_sched1_kwayprojections_svdbegin Test #777: shm_example_simple_lap_s_facto0_sched1_not_tqrcpend .....................***Timeout 190.97 sec Start 777: shm_example_simple_lap_s_facto0_sched1_not_tqrcpend Test #782: shm_example_simple_lap_s_facto0_sched1_not_rqrrtbegin ...................***Timeout 190.83 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.379846e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.368937e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.747775e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 4.346270e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.159632e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.744285e-02 s Time to initialize coeftab 
1.802266e-01 s Time to factorize 1.597712e+00 s ( 3.17 MFlop/s) Number of operations 6.47 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 1.76 Ko Outside 2.11 Ko Low-rank supernodes Diag in diag 124 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 191 Ko / 191 Ko ------------------------------------------------ Total 319 Ko / 319 Ko Time to solve 5.013793e-01 s Time for refinement 6.448641e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.062127e-07 max(|| b_i - A x_i ||_1) 2.306862e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.898723e+00 (SUCCESS) Start 782: shm_example_simple_lap_s_facto0_sched1_not_rqrrtbegin Test #784: shm_example_simple_lap_s_facto0_sched1_kway_rqrrtbegin ..................***Timeout 190.81 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 784: 
shm_example_simple_lap_s_facto0_sched1_kway_rqrrtbegin Test #786: shm_example_simple_lap_s_facto0_sched1_kwayprojections_rqrrtbegin .......***Timeout 190.80 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.029558e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.457320e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.697361e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 
5.462050e-04 s Time for mapping/scheduling 2.125054e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.365163e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.876982e-02 s Time to initialize coeftab 1.176402e-01 s Time to factorize 1.185473e+00 s ( 4.27 MFlop/s) Number of operations 6.47 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 1.76 Ko Outside 2.11 Ko Low-rank supernodes Diag in diag 124 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 191 Ko / 191 Ko ------------------------------------------------ Total 319 Ko / 319 Ko Time to solve 3.296168e-01 s - iteration 1 : total iteration time 1.62 s error 3.1477e-11 Time for refinement 2.614217e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.919695e-08 max(|| b_i - A x_i ||_1) 2.921256e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.670747e-01 (SUCCESS) Start 786: shm_example_simple_lap_s_facto0_sched1_kwayprojections_rqrrtbegin Test #791: shm_example_simple_lap_s_facto1_sched1_not_svdend .......................***Timeout 189.97 sec Start 791: shm_example_simple_lap_s_facto1_sched1_not_svdend Test #793: shm_example_simple_lap_s_facto1_sched1_kway_svdend ......................***Timeout 189.68 sec Start 793: shm_example_simple_lap_s_facto1_sched1_kway_svdend Test #795: shm_example_simple_lap_s_facto1_sched1_kwayprojections_svdend ...........***Timeout 189.58 sec Start 795: shm_example_simple_lap_s_facto1_sched1_kwayprojections_svdend Test #796: shm_example_simple_lap_s_facto1_sched1_not_pqrcpbegin ...................***Timeout 189.59 sec Start 796: shm_example_simple_lap_s_facto1_sched1_not_pqrcpbegin Test #798: shm_example_simple_lap_s_facto1_sched1_kway_pqrcpbegin ..................***Timeout 
189.16 sec Start 798: shm_example_simple_lap_s_facto1_sched1_kway_pqrcpbegin Test #800: shm_example_simple_lap_s_facto1_sched1_kwayprojections_pqrcpbegin .......***Timeout 189.03 sec Start 800: shm_example_simple_lap_s_facto1_sched1_kwayprojections_pqrcpbegin Test #20: c_shm_example_simple_lap_d_facto0 ....................................... Passed 187.80 sec Test #145: c_shm_example_refinement_lap_c_refine_cg_sym ............................ Passed 184.34 sec Test #295: shm_example_simple_lap_s_facto1_sched0_kwayprojections_rqrcpend ......... Passed 183.93 sec Test #381: shm_example_simple_lap_d_facto1_sched0_not_pqrcpend ..................... Passed 183.81 sec Test #465: shm_example_simple_lap_c_facto0_sched0_kway_rqrrtend .................... Passed 183.61 sec Test #808: shm_example_simple_lap_s_facto1_sched1_not_tqrcpbegin ................... Passed 183.40 sec Test #828: shm_example_simple_lap_s_facto2_sched1_not_pqrcpbegin ................... Passed 183.20 sec Test #440: shm_example_simple_lap_c_facto0_sched0_kway_svdbegin .................... 
Passed 182.86 sec Test #493: shm_example_simple_lap_c_facto1_sched0_kwayprojections_tqrcpend .........***Timeout 197.50 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Test #494: shm_example_simple_lap_c_facto1_sched0_not_rqrrtbegin ...................***Timeout 197.39 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 
1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Test #744: shm_example_simple_lap_z_facto4_sched0_not_tqrcpbegin ...................***Timeout 195.90 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 744: shm_example_simple_lap_z_facto4_sched0_not_tqrcpbegin Test #752: shm_example_simple_lap_z_facto4_sched0_kway_rqrrtbegin ..................***Timeout 193.93 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: 
Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 752: shm_example_simple_lap_z_facto4_sched0_kway_rqrrtbegin Test #766: shm_example_simple_lap_s_facto0_sched1_kway_pqrcpbegin ..................***Timeout 193.53 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: 
Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 766: shm_example_simple_lap_s_facto0_sched1_kway_pqrcpbegin Test #768: shm_example_simple_lap_s_facto0_sched1_kwayprojections_pqrcpbegin .......***Timeout 193.22 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 768: shm_example_simple_lap_s_facto0_sched1_kwayprojections_pqrcpbegin Test #776: shm_example_simple_lap_s_facto0_sched1_not_tqrcpbegin ...................***Timeout 193.24 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) 
Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 776: shm_example_simple_lap_s_facto0_sched1_not_tqrcpbegin Test #801: shm_example_simple_lap_s_facto1_sched1_kwayprojections_pqrcpend .........***Timeout 193.19 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 801: shm_example_simple_lap_s_facto1_sched1_kwayprojections_pqrcpend Test #802: shm_example_simple_lap_s_facto1_sched1_not_rqrcpbegin ...................***Timeout 193.17 
sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 802: shm_example_simple_lap_s_facto1_sched1_not_rqrcpbegin Test #770: shm_example_simple_lap_s_facto0_sched1_not_rqrcpbegin ...................***Timeout 192.98 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal 
height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 770: shm_example_simple_lap_s_facto0_sched1_not_rqrcpbegin Test #774: shm_example_simple_lap_s_facto0_sched1_kwayprojections_rqrcpbegin .......***Timeout 192.63 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 774: shm_example_simple_lap_s_facto0_sched1_kwayprojections_rqrcpbegin Test #780: shm_example_simple_lap_s_facto0_sched1_kwayprojections_tqrcpbegin .......***Timeout 192.64 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: 
sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 780: shm_example_simple_lap_s_facto0_sched1_kwayprojections_tqrcpbegin Test #781: shm_example_simple_lap_s_facto0_sched1_kwayprojections_tqrcpend .........***Timeout 192.82 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 
Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 781: shm_example_simple_lap_s_facto0_sched1_kwayprojections_tqrcpend Test #787: shm_example_simple_lap_s_facto0_sched1_kwayprojections_rqrrtend .........***Timeout 192.70 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 787: shm_example_simple_lap_s_facto0_sched1_kwayprojections_rqrrtend Test #790: shm_example_simple_lap_s_facto1_sched1_not_svdbegin .....................***Timeout 192.12 sec Start 790: shm_example_simple_lap_s_facto1_sched1_not_svdbegin Test #792: shm_example_simple_lap_s_facto1_sched1_kway_svdbegin ....................***Timeout 191.88 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: 
Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 792: shm_example_simple_lap_s_facto1_sched1_kway_svdbegin Test #794: shm_example_simple_lap_s_facto1_sched1_kwayprojections_svdbegin .........***Timeout 191.64 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric 
Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 794: shm_example_simple_lap_s_facto1_sched1_kwayprojections_svdbegin Start 1293: shm_example_simple_lap_s_facto0_sched4_not_rqrrtbegin Start 1294: shm_example_simple_lap_s_facto0_sched4_not_rqrrtend Start 1295: shm_example_simple_lap_s_facto0_sched4_kway_rqrrtbegin Start 1296: shm_example_simple_lap_s_facto0_sched4_kway_rqrrtend Start 1297: shm_example_simple_lap_s_facto0_sched4_kwayprojections_rqrrtbegin Start 1298: shm_example_simple_lap_s_facto0_sched4_kwayprojections_rqrrtend Start 1299: shm_example_simple_lap_s_facto0_sched4_kway_pqrcpilu0 Start 1300: shm_example_simple_lap_s_facto0_sched4_kway_pqrcpilu1 Start 1301: shm_example_simple_lap_s_facto1_sched4_not_svdbegin Start 1302: shm_example_simple_lap_s_facto1_sched4_not_svdend Start 1303: shm_example_simple_lap_s_facto1_sched4_kway_svdbegin Start 1304: shm_example_simple_lap_s_facto1_sched4_kway_svdend Start 1305: shm_example_simple_lap_s_facto1_sched4_kwayprojections_svdbegin Start 1306: shm_example_simple_lap_s_facto1_sched4_kwayprojections_svdend Start 1307: shm_example_simple_lap_s_facto1_sched4_not_pqrcpbegin Start 1308: shm_example_simple_lap_s_facto1_sched4_not_pqrcpend Start 1309: shm_example_simple_lap_s_facto1_sched4_kway_pqrcpbegin Start 1310: shm_example_simple_lap_s_facto1_sched4_kway_pqrcpend Start 1311: shm_example_simple_lap_s_facto1_sched4_kwayprojections_pqrcpbegin Start 1312: shm_example_simple_lap_s_facto1_sched4_kwayprojections_pqrcpend Start 1313: shm_example_simple_lap_s_facto1_sched4_not_rqrcpbegin Start 1314: shm_example_simple_lap_s_facto1_sched4_not_rqrcpend Start 1315: shm_example_simple_lap_s_facto1_sched4_kway_rqrcpbegin Start 1316: shm_example_simple_lap_s_facto1_sched4_kway_rqrcpend Start 1317: shm_example_simple_lap_s_facto1_sched4_kwayprojections_rqrcpbegin Start 1318: shm_example_simple_lap_s_facto1_sched4_kwayprojections_rqrcpend Start 1319: shm_example_simple_lap_s_facto1_sched4_not_tqrcpbegin Start 1320: 
shm_example_simple_lap_s_facto1_sched4_not_tqrcpend Start 1321: shm_example_simple_lap_s_facto1_sched4_kway_tqrcpbegin Start 1322: shm_example_simple_lap_s_facto1_sched4_kway_tqrcpend Start 1323: shm_example_simple_lap_s_facto1_sched4_kwayprojections_tqrcpbegin Start 1324: shm_example_simple_lap_s_facto1_sched4_kwayprojections_tqrcpend Start 1325: shm_example_simple_lap_s_facto1_sched4_not_rqrrtbegin Start 1326: shm_example_simple_lap_s_facto1_sched4_not_rqrrtend Start 1327: shm_example_simple_lap_s_facto1_sched4_kway_rqrrtbegin Start 1328: shm_example_simple_lap_s_facto1_sched4_kway_rqrrtend Start 1329: shm_example_simple_lap_s_facto1_sched4_kwayprojections_rqrrtbegin Start 1330: shm_example_simple_lap_s_facto1_sched4_kwayprojections_rqrrtend Start 1331: shm_example_simple_lap_s_facto1_sched4_kway_pqrcpilu0 Start 1332: shm_example_simple_lap_s_facto1_sched4_kway_pqrcpilu1 Start 1333: shm_example_simple_lap_s_facto2_sched4_not_svdbegin Start 1334: shm_example_simple_lap_s_facto2_sched4_not_svdend Test #333: shm_example_simple_lap_s_facto2_sched0_kwayprojections_tqrcpend ......... Passed 189.63 sec Test #335: shm_example_simple_lap_s_facto2_sched0_not_rqrrtend ..................... Passed 189.69 sec Test #482: shm_example_simple_lap_c_facto1_sched0_not_rqrcpbegin ................... Passed 189.36 sec Test #191: shm_example_simple_lap_s_facto1_sched1_1d ............................... Passed 190.00 sec Test #206: shm_example_simple_lap_s_facto0_sched4_1d ............................... Passed 189.99 sec Test #30: c_shm_example_simple_lap_z_facto2 ....................................... Passed 193.81 sec Test #325: shm_example_simple_lap_s_facto2_sched0_kway_rqrcpend .................... Passed 189.86 sec Test #812: shm_example_simple_lap_s_facto1_sched1_kwayprojections_tqrcpbegin ....... Passed 189.38 sec Test #308: shm_example_simple_lap_s_facto1_sched0_kway_pqrcpilu0 ................... 
Passed 190.04 sec Test #823: shm_example_simple_lap_s_facto2_sched1_not_svdend ....................... Passed 189.32 sec Test #830: shm_example_simple_lap_s_facto2_sched1_kway_pqrcpbegin .................. Passed 189.32 sec Test #94: c_shm_example_personal_lap_d_facto2 ..................................... Passed 189.23 sec Test #503: shm_example_simple_lap_c_facto2_sched0_not_svdend ....................... Passed 189.83 sec Test #805: shm_example_simple_lap_s_facto1_sched1_kway_rqrcpend .................... Passed 189.03 sec Test #807: shm_example_simple_lap_s_facto1_sched1_kwayprojections_rqrcpend ......... Passed 189.78 sec Test #820: shm_example_simple_lap_s_facto1_sched1_kway_pqrcpilu0 ................... Passed 189.62 sec Test #829: shm_example_simple_lap_s_facto2_sched1_not_pqrcpend ..................... Passed 189.59 sec Test #142: c_shm_example_refinement_lap_c_refine_cg_her ............................ Passed 191.00 sec Test #278: shm_example_simple_lap_s_facto1_sched0_not_svdbegin ..................... Passed 190.64 sec Test #472: shm_example_simple_lap_c_facto1_sched0_kway_svdbegin .................... Passed 189.34 sec Test #442: shm_example_simple_lap_c_facto0_sched0_kwayprojections_svdbegin ......... Passed 189.47 sec Test #510: shm_example_simple_lap_c_facto2_sched0_kway_pqrcpbegin .................. Passed 190.15 sec Test #373: shm_example_simple_lap_d_facto0_sched0_kway_pqrcpilu1 ................... Passed 190.68 sec Test #810: shm_example_simple_lap_s_facto1_sched1_kway_tqrcpbegin .................. 
Passed 191.02 sec Start 1335: shm_example_simple_lap_s_facto2_sched4_kway_svdbegin Start 1336: shm_example_simple_lap_s_facto2_sched4_kway_svdend Start 1337: shm_example_simple_lap_s_facto2_sched4_kwayprojections_svdbegin Start 1338: shm_example_simple_lap_s_facto2_sched4_kwayprojections_svdend Start 1339: shm_example_simple_lap_s_facto2_sched4_not_pqrcpbegin Start 1340: shm_example_simple_lap_s_facto2_sched4_not_pqrcpend Start 1341: shm_example_simple_lap_s_facto2_sched4_kway_pqrcpbegin Start 1342: shm_example_simple_lap_s_facto2_sched4_kway_pqrcpend Start 1343: shm_example_simple_lap_s_facto2_sched4_kwayprojections_pqrcpbegin Start 1344: shm_example_simple_lap_s_facto2_sched4_kwayprojections_pqrcpend Start 1345: shm_example_simple_lap_s_facto2_sched4_not_rqrcpbegin Start 1346: shm_example_simple_lap_s_facto2_sched4_not_rqrcpend Start 1347: shm_example_simple_lap_s_facto2_sched4_kway_rqrcpbegin Start 1348: shm_example_simple_lap_s_facto2_sched4_kway_rqrcpend Start 1349: shm_example_simple_lap_s_facto2_sched4_kwayprojections_rqrcpbegin Start 1350: shm_example_simple_lap_s_facto2_sched4_kwayprojections_rqrcpend Start 1351: shm_example_simple_lap_s_facto2_sched4_not_tqrcpbegin Start 1352: shm_example_simple_lap_s_facto2_sched4_not_tqrcpend Start 1353: shm_example_simple_lap_s_facto2_sched4_kway_tqrcpbegin Start 1354: shm_example_simple_lap_s_facto2_sched4_kway_tqrcpend Start 1355: shm_example_simple_lap_s_facto2_sched4_kwayprojections_tqrcpbegin Start 1356: shm_example_simple_lap_s_facto2_sched4_kwayprojections_tqrcpend Start 1357: shm_example_simple_lap_s_facto2_sched4_not_rqrrtbegin Start 1358: shm_example_simple_lap_s_facto2_sched4_not_rqrrtend Test #131: c_shm_example_step-by-step_single_hb .................................... Passed 195.02 sec Test #489: shm_example_simple_lap_c_facto1_sched0_not_tqrcpend ..................... Passed 193.88 sec Test #130: c_shm_example_step-by-step_single_mm .................................... 
Passed 195.06 sec Test #415: shm_example_simple_lap_d_facto2_sched0_kway_pqrcpend .................... Passed 194.13 sec Test #813: shm_example_simple_lap_s_facto1_sched1_kwayprojections_tqrcpend ......... Passed 193.65 sec Test #447: shm_example_simple_lap_c_facto0_sched0_kway_pqrcpend .................... Passed 194.01 sec Test #161: c_shm_example_simple_mixed_lap_d_refine_gmres_sym ....................... Passed 194.52 sec Test #167: c_shm_example_simple_mixed_lap_z_facto4 ................................. Passed 194.51 sec Test #498: shm_example_simple_lap_c_facto1_sched0_kwayprojections_rqrrtbegin ....... Passed 193.89 sec Test #831: shm_example_simple_lap_s_facto2_sched1_kway_pqrcpend .................... Passed 193.45 sec Test #430: shm_example_simple_lap_d_facto2_sched0_not_rqrrtbegin ................... Passed 194.14 sec Test #439: shm_example_simple_lap_c_facto0_sched0_not_svdend ....................... Passed 194.11 sec Test #406: shm_example_simple_lap_d_facto2_sched0_not_svdbegin ..................... Passed 194.20 sec Test #451: shm_example_simple_lap_c_facto0_sched0_not_rqrcpend ..................... Passed 194.02 sec Test #459: shm_example_simple_lap_c_facto0_sched0_kway_tqrcpend .................... Passed 194.01 sec Test #509: shm_example_simple_lap_c_facto2_sched0_not_pqrcpend ..................... Passed 193.83 sec Test #298: shm_example_simple_lap_s_facto1_sched0_kway_tqrcpbegin .................. Passed 194.37 sec Test #316: shm_example_simple_lap_s_facto2_sched0_not_pqrcpbegin ................... Passed 194.32 sec Test #500: shm_example_simple_lap_c_facto1_sched0_kway_pqrcpilu0 ................... Passed 193.91 sec Test #310: shm_example_simple_lap_s_facto2_sched0_not_svdbegin ..................... Passed 194.35 sec Test #819: shm_example_simple_lap_s_facto1_sched1_kwayprojections_rqrrtend ......... Passed 193.63 sec Test #221: shm_example_simple_lap_z_facto4_sched4_1d ............................... 
Passed 194.45 sec Test #147: c_shm_example_refinement_lap_c_refine_bicgstab_sym ...................... Passed 194.84 sec Test #443: shm_example_simple_lap_c_facto0_sched0_kwayprojections_svdend ........... Passed 193.15 sec Test #138: c_shm_example_refinement_lap_s_refine_bicgstab_sym ...................... Passed 194.99 sec Test #470: shm_example_simple_lap_c_facto1_sched0_not_svdbegin ..................... Passed 194.01 sec Test #139: c_shm_example_refinement_lap_d_refine_cg_sym ............................ Passed 194.99 sec Test #374: shm_example_simple_lap_d_facto1_sched0_not_svdbegin ..................... Passed 194.28 sec Test #315: shm_example_simple_lap_s_facto2_sched0_kwayprojections_svdend ........... Passed 194.37 sec Test #311: shm_example_simple_lap_s_facto2_sched0_not_svdend ....................... Passed 194.38 sec Test #207: shm_example_simple_lap_s_facto1_sched4_1d ............................... Passed 194.51 sec Test #809: shm_example_simple_lap_s_facto1_sched1_not_tqrcpend ..................... Passed 193.79 sec Test #505: shm_example_simple_lap_c_facto2_sched0_kway_svdend ...................... Passed 193.92 sec Test #817: shm_example_simple_lap_s_facto1_sched1_kway_rqrrtend .................... Passed 193.70 sec Test #168: c_shm_example_simple_mixed_lap_z_refine_cg_her .......................... Passed 194.60 sec Test #34: c_shm_example_simple_solve_and_refine_lap_s_facto1 ...................... Passed 198.15 sec Test #144: c_shm_example_refinement_lap_c_refine_bicgstab_her ...................... Passed 194.95 sec Test #395: shm_example_simple_lap_d_facto1_sched0_kway_tqrcpend .................... Passed 194.30 sec Test #322: shm_example_simple_lap_s_facto2_sched0_not_rqrcpbegin ................... Passed 194.39 sec Test #814: shm_example_simple_lap_s_facto1_sched1_not_rqrrtbegin ................... Passed 193.77 sec Test #248: shm_example_simple_lap_s_facto0_sched0_kway_svdbegin .................... 
Passed 194.52 sec Test #418: shm_example_simple_lap_d_facto2_sched0_not_rqrcpbegin ................... Passed 194.26 sec Test #170: c_shm_example_simple_mixed_lap_z_refine_bicgstab_her .................... Passed 194.62 sec 890/3626 Test #1050: shm_example_simple_lap_c_facto3_sched1_kwayprojections_svdend ........... Passed 184.24 sec Test #136: c_shm_example_refinement_lap_s_refine_cg_sym ............................ Passed 195.11 sec Test #164: c_shm_example_simple_mixed_lap_z_facto1 ................................. Passed 194.66 sec Test #212: shm_example_simple_lap_c_facto0_sched4_1d ............................... Passed 194.58 sec Test #148: c_shm_example_refinement_lap_z_refine_cg_her ............................ Passed 194.93 sec Test #141: c_shm_example_refinement_lap_d_refine_bicgstab_sym ...................... Passed 195.05 sec Test #137: c_shm_example_refinement_lap_s_refine_gmres_sym ......................... Passed 195.16 sec Test #199: shm_example_simple_lap_c_facto3_sched1_1d ............................... Passed 194.66 sec Test #804: shm_example_simple_lap_s_facto1_sched1_kway_rqrcpbegin .................. 
Passed 193.99 sec Start 1359: shm_example_simple_lap_s_facto2_sched4_kway_rqrrtbegin Start 1360: shm_example_simple_lap_s_facto2_sched4_kway_rqrrtend Start 1361: shm_example_simple_lap_s_facto2_sched4_kwayprojections_rqrrtbegin Start 1362: shm_example_simple_lap_s_facto2_sched4_kwayprojections_rqrrtend Start 1363: shm_example_simple_lap_s_facto2_sched4_kway_pqrcpilu0 Start 1364: shm_example_simple_lap_s_facto2_sched4_kway_pqrcpilu1 Start 1365: shm_example_simple_lap_d_facto0_sched4_not_svdbegin Start 1366: shm_example_simple_lap_d_facto0_sched4_not_svdend Start 1367: shm_example_simple_lap_d_facto0_sched4_kway_svdbegin Start 1368: shm_example_simple_lap_d_facto0_sched4_kway_svdend Start 1369: shm_example_simple_lap_d_facto0_sched4_kwayprojections_svdbegin Start 1370: shm_example_simple_lap_d_facto0_sched4_kwayprojections_svdend Start 1371: shm_example_simple_lap_d_facto0_sched4_not_pqrcpbegin Start 1372: shm_example_simple_lap_d_facto0_sched4_not_pqrcpend Start 1373: shm_example_simple_lap_d_facto0_sched4_kway_pqrcpbegin Start 1374: shm_example_simple_lap_d_facto0_sched4_kway_pqrcpend Start 1375: shm_example_simple_lap_d_facto0_sched4_kwayprojections_pqrcpbegin Start 1376: shm_example_simple_lap_d_facto0_sched4_kwayprojections_pqrcpend Start 1377: shm_example_simple_lap_d_facto0_sched4_not_rqrcpbegin Start 1378: shm_example_simple_lap_d_facto0_sched4_not_rqrcpend Start 1379: shm_example_simple_lap_d_facto0_sched4_kway_rqrcpbegin Start 1380: shm_example_simple_lap_d_facto0_sched4_kway_rqrcpend Start 1381: shm_example_simple_lap_d_facto0_sched4_kwayprojections_rqrcpbegin Start 1382: shm_example_simple_lap_d_facto0_sched4_kwayprojections_rqrcpend Start 1383: shm_example_simple_lap_d_facto0_sched4_not_tqrcpbegin Start 1384: shm_example_simple_lap_d_facto0_sched4_not_tqrcpend Start 1385: shm_example_simple_lap_d_facto0_sched4_kway_tqrcpbegin Start 1386: shm_example_simple_lap_d_facto0_sched4_kway_tqrcpend Start 1387: 
shm_example_simple_lap_d_facto0_sched4_kwayprojections_tqrcpbegin Start 1388: shm_example_simple_lap_d_facto0_sched4_kwayprojections_tqrcpend Start 1389: shm_example_simple_lap_d_facto0_sched4_not_rqrrtbegin Start 1390: shm_example_simple_lap_d_facto0_sched4_not_rqrrtend Start 1391: shm_example_simple_lap_d_facto0_sched4_kway_rqrrtbegin Start 1392: shm_example_simple_lap_d_facto0_sched4_kway_rqrrtend Start 1393: shm_example_simple_lap_d_facto0_sched4_kwayprojections_rqrrtbegin Start 1394: shm_example_simple_lap_d_facto0_sched4_kwayprojections_rqrrtend Start 1395: shm_example_simple_lap_d_facto0_sched4_kway_pqrcpilu0 Start 1396: shm_example_simple_lap_d_facto0_sched4_kway_pqrcpilu1 Start 1397: shm_example_simple_lap_d_facto1_sched4_not_svdbegin Start 1398: shm_example_simple_lap_d_facto1_sched4_not_svdend Start 1399: shm_example_simple_lap_d_facto1_sched4_kway_svdbegin Start 1400: shm_example_simple_lap_d_facto1_sched4_kway_svdend Start 1401: shm_example_simple_lap_d_facto1_sched4_kwayprojections_svdbegin Start 1402: shm_example_simple_lap_d_facto1_sched4_kwayprojections_svdend Start 1403: shm_example_simple_lap_d_facto1_sched4_not_pqrcpbegin Start 1404: shm_example_simple_lap_d_facto1_sched4_not_pqrcpend Start 1405: shm_example_simple_lap_d_facto1_sched4_kway_pqrcpbegin Start 1406: shm_example_simple_lap_d_facto1_sched4_kway_pqrcpend Start 1407: shm_example_simple_lap_d_facto1_sched4_kwayprojections_pqrcpbegin Start 1408: shm_example_simple_lap_d_facto1_sched4_kwayprojections_pqrcpend Start 1409: shm_example_simple_lap_d_facto1_sched4_not_rqrcpbegin Start 1410: shm_example_simple_lap_d_facto1_sched4_not_rqrcpend Test #67: c_shm_example_step-by-step_lap_s_facto2 ................................. Passed 204.71 sec Test #132: c_shm_example_step-by-step_single_mm2 ................................... Passed 202.49 sec Test #66: c_shm_example_step-by-step_lap_s_facto1 ................................. 
Passed 204.80 sec Test #159: c_shm_example_simple_mixed_lap_d_facto2 ................................. Passed 202.01 sec Test #825: shm_example_simple_lap_s_facto2_sched1_kway_svdend ...................... Passed 200.99 sec Test #811: shm_example_simple_lap_s_facto1_sched1_kway_tqrcpend .................... Passed 201.17 sec 905/3626 Test #1056: shm_example_simple_lap_c_facto3_sched1_kwayprojections_pqrcpend ......... Passed 182.66 sec Test #290: shm_example_simple_lap_s_facto1_sched0_not_rqrcpbegin ................... Passed 201.86 sec Test #151: c_shm_example_refinement_lap_z_refine_cg_sym ............................ Passed 202.19 sec 908/3626 Test #1048: shm_example_simple_lap_c_facto3_sched1_kway_svdend ...................... Passed 194.50 sec Test #143: c_shm_example_refinement_lap_c_refine_gmres_her ......................... Passed 202.36 sec Test #818: shm_example_simple_lap_s_facto1_sched1_kwayprojections_rqrrtbegin ....... Passed 201.11 sec 911/3626 Test #1044: shm_example_simple_lap_c_facto2_sched1_kway_pqrcpilu1 ................... Passed 200.40 sec 912/3626 Test #1046: shm_example_simple_lap_c_facto3_sched1_not_svdend ....................... Passed 197.02 sec Test #146: c_shm_example_refinement_lap_c_refine_gmres_sym ......................... Passed 202.34 sec Test #150: c_shm_example_refinement_lap_z_refine_bicgstab_her ...................... Passed 202.27 sec 915/3626 Test #1052: shm_example_simple_lap_c_facto3_sched1_not_pqrcpend ..................... Passed 188.01 sec Test #153: c_shm_example_refinement_lap_z_refine_bicgstab_sym ...................... Passed 202.12 sec Test #822: shm_example_simple_lap_s_facto2_sched1_not_svdbegin ..................... Passed 201.10 sec 918/3626 Test #1051: shm_example_simple_lap_c_facto3_sched1_not_pqrcpbegin ................... Passed 189.52 sec Test #65: c_shm_example_step-by-step_lap_s_facto0 ................................. 
Passed 205.37 sec Test #75: c_shm_example_step-by-step_lap_c_facto4 ................................. Passed 204.49 sec Test #173: c_shm_example_simple_mixed_lap_z_refine_bicgstab_sym .................... Passed 202.48 sec Test #438: shm_example_simple_lap_c_facto0_sched0_not_svdbegin ..................... Passed 202.10 sec Test #806: shm_example_simple_lap_s_facto1_sched1_kwayprojections_rqrcpbegin ....... Passed 201.74 sec Test #410: shm_example_simple_lap_d_facto2_sched0_kwayprojections_svdbegin ......... Passed 201.20 sec Test #816: shm_example_simple_lap_s_facto1_sched1_kway_rqrrtbegin .................. Passed 201.12 sec 926/3626 Test #1042: shm_example_simple_lap_c_facto2_sched1_kwayprojections_rqrrtend ......... Passed 200.98 sec 927/3626 Test #1054: shm_example_simple_lap_c_facto3_sched1_kway_pqrcpend .................... Passed 185.03 sec Test #68: c_shm_example_step-by-step_lap_d_facto0 .................................***Timeout 206.99 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.160356e+00 s +-------------------------------------------------+ Symbolic 
factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.495715e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.969762e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 4.952553e-01 s Time to initialize internal csc 2.987158e-03 s Time to initialize coeftab 3.678546e-02 s Time to factorize 8.669032e-01 s ( 5.84 MFlop/s) Number of operations 6.47 MFlops Number of static pivots 0 Memory usage of coeftab 638 Ko Time to solve 8.119952e-01 s Time for refinement 2.296047e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.176416e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.199944e-16 max(|| b_i - A x_i ||_1) 2.118412e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.683317e-03 (SUCCESS) max(|| x_i ||_oo) 4.996108e-01 max(|| x0_i - x_i ||_oo) 1.332268e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.666611e-03 (SUCCESS) WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Time to solve 1.233662e+00 s Time for refinement 1.504834e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.176416e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.959892e-16 max(|| b_i - A x_i ||_1) 2.067179e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.618423e-03 (SUCCESS) max(|| x_i ||_oo) 4.996108e-01 max(|| x0_i - x_i ||_oo) 1.332268e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.666611e-03 (SUCCESS) WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Time to initialize internal csc 1.284531e-03 s Time to initialize coeftab 1.075546e-01 s Time 
to factorize 7.390062e-01 s ( 6.85 MFlop/s) Number of operations 6.47 MFlops Number of static pivots 0 Memory usage of coeftab 638 Ko Time to solve 7.393622e-01 s Time for refinement 1.332318e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.176416e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.039767e-16 max(|| b_i - A x_i ||_1) 2.074042e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.627116e-03 (SUCCESS) max(|| x_i ||_oo) 4.996108e-01 max(|| x0_i - x_i ||_oo) 1.332268e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.666611e-03 (SUCCESS) WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Time to solve 5.826814e-01 s Time for refinement 1.067167e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.176416e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.995848e-16 max(|| b_i - A x_i ||_1) 2.083882e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.639580e-03 (SUCCESS) max(|| x_i ||_oo) 4.996108e-01 max(|| x0_i - x_i ||_oo) 1.221245e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.445041e-03 (SUCCESS) Test #70: c_shm_example_step-by-step_lap_d_facto2 .................................***Timeout 207.52 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 WARNING: Refinement works only 
with 1 rhs, We will iterate on each RHS one by one +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.423570e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.335391e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.773868e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 1.004335e-01 s Time to initialize internal csc 6.413154e-03 s Time to initialize coeftab 7.622735e-02 s Time to factorize 8.947615e-01 s (11.16 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Memory usage of coeftab 1.25 Mo Time to solve 8.954544e-01 s Time for refinement 2.295018e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.176416e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.969822e-16 max(|| b_i - A x_i ||_1) 1.876820e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.399830e-03 (SUCCESS) max(|| x_i ||_oo) 4.996108e-01 max(|| x0_i - x_i ||_oo) 1.221245e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.444393e-03 (SUCCESS) WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Time to solve 6.688789e-01 s Time for refinement 1.649263e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.176416e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.865439e-16 max(|| b_i - A x_i ||_1) 1.848217e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.363255e-03 (SUCCESS) max(|| x_i ||_oo) 4.996108e-01 max(|| x0_i - x_i ||_oo) 
1.165734e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.334261e-03 (SUCCESS) WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Time to initialize internal csc 1.492189e-03 s Time to initialize coeftab 4.771422e-02 s Time to factorize 8.591998e-01 s (11.62 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Memory usage of coeftab 1.25 Mo Time to solve 8.028085e-01 s Time for refinement 1.468399e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.176416e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.950550e-16 max(|| b_i - A x_i ||_1) 1.879552e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.413939e-03 (SUCCESS) max(|| x_i ||_oo) 4.996108e-01 max(|| x0_i - x_i ||_oo) 1.165734e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.334261e-03 (SUCCESS) WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Time to solve 5.425210e-01 s Time for refinement 1.321936e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.176416e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.945420e-16 max(|| b_i - A x_i ||_1) 1.871782e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.403959e-03 (SUCCESS) max(|| x_i ||_oo) 4.996108e-01 max(|| x0_i - x_i ||_oo) 1.276756e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.561339e-03 (SUCCESS) Test #71: c_shm_example_step-by-step_lap_c_facto0 .................................***Timeout 207.53 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size 
(min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.080932e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.748735e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.294960e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 1.719073e-01 s Time to initialize internal csc 3.498788e-02 s Time to initialize coeftab 2.813051e-02 s Time to factorize 1.364429e+00 s (14.86 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Memory usage of coeftab 638 Ko Time to solve 1.368984e+00 s Time for refinement 4.879553e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.764572e-02 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.296364e-07 max(|| b_i - A x_i ||_1) 9.471112e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.417263e+00 (SUCCESS) max(|| x_i ||_oo) 7.029080e-01 max(|| x0_i - x_i ||_oo) 6.302681e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 9.098860e-01 (SUCCESS) WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Time to solve 1.320663e+00 s Time for refinement 2.742502e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 
2.764572e-02 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.286849e-07 max(|| b_i - A x_i ||_1) 9.418903e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.422022e+00 (SUCCESS) max(|| x_i ||_oo) 7.029080e-01 max(|| x0_i - x_i ||_oo) 7.638935e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 1.107851e+00 (SUCCESS) WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Time to initialize internal csc 4.096407e-02 s Time to initialize coeftab 8.138805e-02 s Time to factorize 2.491637e+00 s ( 8.14 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Memory usage of coeftab 638 Ko Time to solve 1.265247e+00 s Time for refinement 1.996453e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.764572e-02 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.279482e-07 max(|| b_i - A x_i ||_1) 9.374752e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.386055e+00 (SUCCESS) max(|| x_i ||_oo) 7.029080e-01 max(|| x0_i - x_i ||_oo) 6.260261e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 8.989819e-01 (SUCCESS) WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Time to solve 1.202897e+00 s Time for refinement 1.453457e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.764572e-02 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.276514e-07 max(|| b_i - A x_i ||_1) 9.363070e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.415733e+00 (SUCCESS) max(|| x_i ||_oo) 7.029080e-01 max(|| x0_i - x_i ||_oo) 6.912597e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 9.848672e-01 (SUCCESS) Test #72: c_shm_example_step-by-step_lap_c_facto1 .................................***Timeout 207.53 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: 
Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.034634e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.946677e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.162638e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.656388e-01 s Time to initialize internal csc 3.777335e-02 s Time to initialize coeftab 5.437254e-02 s Time to factorize 1.116283e+00 s (19.09 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Memory usage of coeftab 638 Ko Time to solve 1.200445e+00 s Time for refinement 4.850818e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.764572e-02 max(|| x_i ||_oo) 7.029081e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.139842e-07 max(|| b_i - A x_i ||_1) 9.022735e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.266395e+00 (SUCCESS) max(|| x_i ||_oo) 7.029081e-01 max(|| x0_i - x_i ||_oo) 
5.967910e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 8.549203e-01 (SUCCESS) WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Time to solve 1.337600e+00 s Time for refinement 2.354744e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.764572e-02 max(|| x_i ||_oo) 7.029081e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.198048e-07 max(|| b_i - A x_i ||_1) 9.107449e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.295349e+00 (SUCCESS) max(|| x_i ||_oo) 7.029081e-01 max(|| x0_i - x_i ||_oo) 6.469576e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 9.382621e-01 (SUCCESS) WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Time to initialize internal csc 1.276451e-03 s Time to initialize coeftab 1.045039e-01 s Time to factorize 1.978173e+00 s (10.77 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Memory usage of coeftab 638 Ko Time to solve 1.240801e+00 s Time for refinement 1.638500e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.764572e-02 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.208305e-07 max(|| b_i - A x_i ||_1) 9.033423e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.293980e+00 (SUCCESS) max(|| x_i ||_oo) 7.029080e-01 max(|| x0_i - x_i ||_oo) 6.143906e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 8.740697e-01 (SUCCESS) WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Time to solve 1.661098e+00 s Time for refinement 1.335465e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.764572e-02 max(|| x_i ||_oo) 7.029081e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.191600e-07 max(|| b_i - A x_i ||_1) 9.121933e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.290332e+00 (SUCCESS) max(|| x_i ||_oo) 7.029081e-01 max(|| x0_i - x_i ||_oo) 7.047793e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 1.022119e+00 (SUCCESS) Test #73: c_shm_example_step-by-step_lap_c_facto2 .................................***Timeout 
207.55 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.955111e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.194725e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.531242e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 4.562574e-01 s Time to initialize internal csc 4.409030e-02 s Time to initialize coeftab 4.123259e-02 s Time to factorize 1.739142e+00 s (22.98 MFlop/s) Number of operations 8.15 MFlops Number of static pivots 0 Memory usage of coeftab 1.25 Mo Time to solve 1.075000e+00 s Time for refinement 5.191153e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.764572e-02 
max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.185349e-07 max(|| b_i - A x_i ||_1) 8.657001e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.191812e+00 (SUCCESS) max(|| x_i ||_oo) 7.029080e-01 max(|| x0_i - x_i ||_oo) 5.599348e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 8.074417e-01 (SUCCESS) WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Time to solve 1.328845e+00 s Time for refinement 1.986306e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.764572e-02 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.136507e-07 max(|| b_i - A x_i ||_1) 8.749978e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.213163e+00 (SUCCESS) max(|| x_i ||_oo) 7.029080e-01 max(|| x0_i - x_i ||_oo) 5.693726e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 8.176267e-01 (SUCCESS) WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Time to initialize internal csc 4.641264e-03 s Time to initialize coeftab 1.849288e-02 s Time to factorize 2.792676e+00 s (14.31 MFlop/s) Number of operations 8.15 MFlops Number of static pivots 0 Memory usage of coeftab 1.25 Mo Time to solve 1.401040e+00 s Time for refinement 1.503499e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.764572e-02 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.168306e-07 max(|| b_i - A x_i ||_1) 8.654408e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.208819e+00 (SUCCESS) max(|| x_i ||_oo) 7.029080e-01 max(|| x0_i - x_i ||_oo) 6.758297e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 9.801343e-01 (SUCCESS) WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Time to solve 1.041224e+00 s Time for refinement 1.110882e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.764572e-02 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.184400e-07 max(|| b_i - A x_i ||_1) 8.778589e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * 
eps)) 2.234714e+00 (SUCCESS) max(|| x_i ||_oo) 7.029080e-01 max(|| x0_i - x_i ||_oo) 5.693726e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 8.200677e-01 (SUCCESS) Test #78: c_shm_example_step-by-step_lap_z_facto2 .................................***Timeout 207.26 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.915554e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.196814e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.457441e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 8.508574e-01 s Time to initialize internal csc 9.167730e-02 s Time to initialize coeftab 2.874962e-02 s Time to 
factorize 2.064674e+00 s (19.36 MFlop/s) Number of operations 8.15 MFlops Number of static pivots 0 Memory usage of coeftab 2.49 Mo Time to solve 1.010097e+00 s Time for refinement 3.355450e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.764616e-02 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.800424e-16 max(|| b_i - A x_i ||_1) 1.792377e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.613722e-03 (SUCCESS) max(|| x_i ||_oo) 7.029080e-01 max(|| x0_i - x_i ||_oo) 1.539371e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.256394e-03 (SUCCESS) WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Time to solve 1.141967e+00 s Time for refinement 2.079050e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.764616e-02 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.797922e-16 max(|| b_i - A x_i ||_1) 1.820271e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.685524e-03 (SUCCESS) max(|| x_i ||_oo) 7.029080e-01 max(|| x0_i - x_i ||_oo) 1.584014e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.253516e-03 (SUCCESS) WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Time to initialize internal csc 6.765066e-02 s Time to initialize coeftab 2.803531e-02 s Time to factorize 2.312558e+00 s (17.28 MFlop/s) Number of operations 8.15 MFlops Number of static pivots 0 Memory usage of coeftab 2.49 Mo Time to solve 1.214859e+00 s Time for refinement 1.440337e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.764616e-02 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.831897e-16 max(|| b_i - A x_i ||_1) 1.803157e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.641470e-03 (SUCCESS) max(|| x_i ||_oo) 7.029080e-01 max(|| x0_i - x_i ||_oo) 1.584014e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.253516e-03 (SUCCESS) WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Time to solve 1.114598e+00 s Time for 
refinement 1.240399e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.764616e-02 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.854462e-16 max(|| b_i - A x_i ||_1) 1.814954e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.671837e-03 (SUCCESS) max(|| x_i ||_oo) 7.029080e-01 max(|| x0_i - x_i ||_oo) 1.475229e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.098751e-03 (SUCCESS) Test #79: c_shm_example_step-by-step_lap_z_facto3 .................................***Timeout 207.15 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.049667e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.874820e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.215656e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 
Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 3.183450e-01 s Time to initialize internal csc 7.808253e-02 s Time to initialize coeftab 9.099553e-02 s Time to factorize 1.130478e+00 s (17.94 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Memory usage of coeftab 1.25 Mo Time to solve 1.748695e+00 s Time for refinement 4.414209e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.764616e-02 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.898938e-16 max(|| b_i - A x_i ||_1) 2.007426e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.065417e-03 (SUCCESS) max(|| x_i ||_oo) 7.029080e-01 max(|| x0_i - x_i ||_oo) 1.570092e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.254480e-03 (SUCCESS) WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Time to solve 1.555844e+00 s Time for refinement 2.018191e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.764616e-02 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.017678e-16 max(|| b_i - A x_i ||_1) 2.009158e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.069788e-03 (SUCCESS) max(|| x_i ||_oo) 7.029080e-01 max(|| x0_i - x_i ||_oo) 1.640164e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.404135e-03 (SUCCESS) WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Time to initialize internal csc 4.414705e-03 s Time to initialize coeftab 2.394963e-02 s Time to factorize 2.187013e+00 s ( 9.27 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Memory usage of coeftab 1.25 Mo Time to solve 1.123282e+00 s Time for refinement 1.635949e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.764616e-02 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.959842e-16 max(|| b_i - A x_i ||_1) 2.002600e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.053239e-03 
(SUCCESS) max(|| x_i ||_oo) 7.029080e-01 max(|| x0_i - x_i ||_oo) 1.542371e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.260791e-03 (SUCCESS) WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Time to solve 1.157558e+00 s Time for refinement 1.077991e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.764616e-02 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.012930e-16 max(|| b_i - A x_i ||_1) 2.013127e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.079802e-03 (SUCCESS) max(|| x_i ||_oo) 7.029080e-01 max(|| x0_i - x_i ||_oo) 1.474968e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.098380e-03 (SUCCESS) Test #434: shm_example_simple_lap_d_facto2_sched0_kwayprojections_rqrrtbegin .......***Timeout 206.12 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 
6.118049e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.435660e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.916849e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 5.595288e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.937058e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 9.039576e-02 s Time to initialize coeftab 8.795166e-01 s Time to factorize 1.107840e+01 s (922.90 KFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 5.345831e-03 s - iteration 1 : total iteration time 0.004 s error 1.008e-12 - iteration 2 : total iteration time 0.00348 s error 4.2236e-18 Time for refinement 1.314317e-02 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.255520e-16 max(|| b_i - A x_i ||_1) 6.250679e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.854516e-04 (SUCCESS) Test #452: shm_example_simple_lap_c_facto0_sched0_kway_rqrcpbegin ..................***Timeout 206.06 sec ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.746020e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.185651e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.081694e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 1.084544e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.908199e+00 s +-------------------------------------------------+ Factorization task: 
Factorization used: LL^h Time to initialize internal csc 8.939982e-03 s Time to initialize coeftab 1.042844e+00 s Time to factorize 7.090903e+00 s ( 2.86 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 3.121314e-02 s - iteration 1 : total iteration time 0.0312 s error 5.2041e-11 Time for refinement 4.660146e-02 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822262e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.507199e-08 max(|| b_i - A x_i ||_1) 3.249837e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.200309e-01 (SUCCESS) Test #504: shm_example_simple_lap_c_facto2_sched0_kway_svdbegin ....................***Timeout 205.92 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections 
width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.387311e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.247283e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.189119e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 1.374540e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.572392e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.503377e-03 s Time to initialize coeftab 1.412701e+00 s Time to factorize 1.138399e+01 s ( 3.51 MFlop/s) Number of operations 8.15 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 1.606573e-02 s Time for refinement 4.660024e-03 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.157403e-07 max(|| b_i - A x_i ||_1) 9.287170e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.343431e+00 (SUCCESS) Test #815: shm_example_simple_lap_s_facto1_sched1_not_rqrrtend 
.....................***Timeout 205.71 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.857240e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.628082e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.178693e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.501408e-02 s +-------------------------------------------------+ Analyze task: Total time 
for analyze 3.053091e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.119514e-03 s Time to initialize coeftab 4.431087e-02 s Time to factorize 1.437716e+00 s ( 3.64 MFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 1.76 Ko Outside 2.11 Ko Low-rank supernodes Diag in diag 124 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 191 Ko / 191 Ko ------------------------------------------------ Total 319 Ko / 319 Ko Time to solve 8.119291e-01 s Time for refinement 4.795145e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.894392e-07 max(|| b_i - A x_i ||_1) 8.184715e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.028463e+00 (SUCCESS) Start 815: shm_example_simple_lap_s_facto1_sched1_not_rqrrtend Test #824: shm_example_simple_lap_s_facto2_sched1_kway_svdbegin ....................***Timeout 205.76 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY 
Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.765313e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.623843e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.778667e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 6.362700e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.591704e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 8.213787e-02 s Time to initialize coeftab 7.965097e-01 s Time to factorize 9.848016e+00 s ( 1.01 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 124 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 514 Ko / 514 Ko Time to solve 6.358278e-01 s Time for refinement 3.871086e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.164814e-07 max(|| b_i - A x_i ||_1) 9.341328e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 
1.173799e+00 (SUCCESS) Start 824: shm_example_simple_lap_s_facto2_sched1_kway_svdbegin Test #826: shm_example_simple_lap_s_facto2_sched1_kwayprojections_svdbegin .........***Timeout 205.92 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.488760e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.841596e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.243865e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 
6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 3.421047e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.117129e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 7.017988e-02 s Time to initialize coeftab 6.733140e-01 s Time to factorize 9.289813e+00 s ( 1.07 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 124 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 514 Ko / 514 Ko Time to solve 1.062587e+00 s Time for refinement 3.802907e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.165756e-07 max(|| b_i - A x_i ||_1) 9.322464e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.171428e+00 (SUCCESS) Start 826: shm_example_simple_lap_s_facto2_sched1_kwayprojections_svdbegin Test #827: shm_example_simple_lap_s_facto2_sched1_kwayprojections_svdend ...........***Timeout 206.06 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress 
minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.081131e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.912139e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.643527e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 4.558220e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.979412e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 9.496017e-02 s Time to initialize coeftab 4.617866e-02 s Time to factorize 2.927081e+00 s ( 3.41 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 124 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 514 Ko / 514 Ko Time to solve 1.720190e+00 s Time for refinement 7.474097e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| 
x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.922028e-07 max(|| b_i - A x_i ||_1) 8.222240e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.033178e+00 (SUCCESS) Start 827: shm_example_simple_lap_s_facto2_sched1_kwayprojections_svdend 938/3626 Test #1040: shm_example_simple_lap_c_facto2_sched1_kway_rqrrtend ....................***Timeout 206.55 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.311336e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.827600e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.082271e-01 s 
+-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 1.418597e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.622573e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.874766e-02 s Time to initialize coeftab 2.769614e-01 s Time to factorize 6.345353e+00 s ( 6.30 MFlop/s) Number of operations 8.15 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 1.161438e+00 s Time for refinement 2.855257e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.017027e-07 max(|| b_i - A x_i ||_1) 8.578198e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.164536e+00 (SUCCESS) Start 1040: shm_example_simple_lap_c_facto2_sched1_kway_rqrrtend 938/3626 Test #1043: shm_example_simple_lap_c_facto2_sched1_kway_pqrcpilu0 ...................***Timeout 206.53 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) 
Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.850537e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.872521e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.997114e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 2.008160e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.114333e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.741955e-02 s Time to initialize coeftab 1.655300e-01 s Time to factorize 5.660275e+00 s ( 7.06 MFlop/s) Number of operations 8.15 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko 
/ 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 7.456032e-01 s Time for refinement 2.842984e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.032226e-07 max(|| b_i - A x_i ||_1) 1.127369e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.844688e+00 (SUCCESS) Start 1043: shm_example_simple_lap_c_facto2_sched1_kway_pqrcpilu0 Start 1411: shm_example_simple_lap_d_facto1_sched4_kway_rqrcpbegin Start 1412: shm_example_simple_lap_d_facto1_sched4_kway_rqrcpend Start 1413: shm_example_simple_lap_d_facto1_sched4_kwayprojections_rqrcpbegin Start 1414: shm_example_simple_lap_d_facto1_sched4_kwayprojections_rqrcpend Start 1415: shm_example_simple_lap_d_facto1_sched4_not_tqrcpbegin Start 1416: shm_example_simple_lap_d_facto1_sched4_not_tqrcpend Start 1417: shm_example_simple_lap_d_facto1_sched4_kway_tqrcpbegin Start 1418: shm_example_simple_lap_d_facto1_sched4_kway_tqrcpend Start 1419: shm_example_simple_lap_d_facto1_sched4_kwayprojections_tqrcpbegin Start 1420: shm_example_simple_lap_d_facto1_sched4_kwayprojections_tqrcpend Start 1421: shm_example_simple_lap_d_facto1_sched4_not_rqrrtbegin Start 1422: shm_example_simple_lap_d_facto1_sched4_not_rqrrtend Start 1423: shm_example_simple_lap_d_facto1_sched4_kway_rqrrtbegin Start 1424: shm_example_simple_lap_d_facto1_sched4_kway_rqrrtend Start 1425: shm_example_simple_lap_d_facto1_sched4_kwayprojections_rqrrtbegin Start 1426: shm_example_simple_lap_d_facto1_sched4_kwayprojections_rqrrtend Start 1427: shm_example_simple_lap_d_facto1_sched4_kway_pqrcpilu0 Start 1428: shm_example_simple_lap_d_facto1_sched4_kway_pqrcpilu1 Start 1429: shm_example_simple_lap_d_facto2_sched4_not_svdbegin Start 1430: shm_example_simple_lap_d_facto2_sched4_not_svdend Start 1431: shm_example_simple_lap_d_facto2_sched4_kway_svdbegin Start 1432: shm_example_simple_lap_d_facto2_sched4_kway_svdend Start 1433: 
shm_example_simple_lap_d_facto2_sched4_kwayprojections_svdbegin Start 1434: shm_example_simple_lap_d_facto2_sched4_kwayprojections_svdend Start 1435: shm_example_simple_lap_d_facto2_sched4_not_pqrcpbegin Start 1436: shm_example_simple_lap_d_facto2_sched4_not_pqrcpend Start 1437: shm_example_simple_lap_d_facto2_sched4_kway_pqrcpbegin Start 1438: shm_example_simple_lap_d_facto2_sched4_kway_pqrcpend Start 1439: shm_example_simple_lap_d_facto2_sched4_kwayprojections_pqrcpbegin Start 1440: shm_example_simple_lap_d_facto2_sched4_kwayprojections_pqrcpend Start 1441: shm_example_simple_lap_d_facto2_sched4_not_rqrcpbegin Start 1442: shm_example_simple_lap_d_facto2_sched4_not_rqrcpend Start 1443: shm_example_simple_lap_d_facto2_sched4_kway_rqrcpbegin Start 1444: shm_example_simple_lap_d_facto2_sched4_kway_rqrcpend Start 1445: shm_example_simple_lap_d_facto2_sched4_kwayprojections_rqrcpbegin Start 1446: shm_example_simple_lap_d_facto2_sched4_kwayprojections_rqrcpend Start 1447: shm_example_simple_lap_d_facto2_sched4_not_tqrcpbegin Start 1448: shm_example_simple_lap_d_facto2_sched4_not_tqrcpend Start 1449: shm_example_simple_lap_d_facto2_sched4_kway_tqrcpbegin Test #839: shm_example_simple_lap_s_facto2_sched1_kwayprojections_rqrcpend ......... Passed 167.41 sec 939/3626 Test #1060: shm_example_simple_lap_c_facto3_sched1_kway_rqrcpend .................... Passed 180.26 sec 940/3626 Test #1074: shm_example_simple_lap_c_facto3_sched1_kwayprojections_rqrrtend ......... Passed 169.25 sec 941/3626 Test #1058: shm_example_simple_lap_c_facto3_sched1_not_rqrcpend ..................... Passed 184.12 sec 942/3626 Test #1075: shm_example_simple_lap_c_facto3_sched1_kway_pqrcpilu0 ................... Passed 168.14 sec 943/3626 Test #1080: shm_example_simple_lap_c_facto4_sched1_kway_svdend ...................... Passed 166.57 sec 944/3626 Test #1055: shm_example_simple_lap_c_facto3_sched1_kwayprojections_pqrcpbegin ....... 
Passed 189.48 sec Test #47: c_shm_example_simple_solve_and_refine_lap_z_facto3 ......................***Timeout 211.96 sec Test #49: c_shm_example_simple_trans_lap_s_facto0 .................................***Timeout 211.87 sec Test #69: c_shm_example_step-by-step_lap_d_facto1 .................................***Timeout 211.33 sec Test #74: c_shm_example_step-by-step_lap_c_facto3 .................................***Timeout 210.97 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.751710e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.875609e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.146974e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL 
Time to factorize 5.462050e-04 s Time for mapping/scheduling 6.989037e-01 s Time to initialize internal csc 8.522979e-02 s Time to initialize coeftab 1.043657e-01 s Time to factorize 2.142231e+00 s ( 9.47 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Memory usage of coeftab 638 Ko Time to solve 1.852479e+00 s Time for refinement 1.860959e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.764572e-02 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.297063e-07 max(|| b_i - A x_i ||_1) 9.404213e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.426348e+00 (SUCCESS) max(|| x_i ||_oo) 7.029080e-01 max(|| x0_i - x_i ||_oo) 6.945044e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 9.894902e-01 (SUCCESS) WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Time to solve 1.041795e+00 s Time for refinement 1.292641e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.764572e-02 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.261097e-07 max(|| b_i - A x_i ||_1) 9.401232e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.404151e+00 (SUCCESS) max(|| x_i ||_oo) 7.029080e-01 max(|| x0_i - x_i ||_oo) 7.638935e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 1.107851e+00 (SUCCESS) Test #76: c_shm_example_step-by-step_lap_z_facto0 .................................***Timeout 210.63 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 
GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.722266e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.295258e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.016412e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 2.064619e-01 s Time to initialize internal csc 3.709472e-01 s Time to initialize coeftab 1.107663e-01 s Time to factorize 1.636166e+00 s (12.40 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Memory usage of coeftab 1.25 Mo Time to solve 1.389319e+00 s Time for refinement 1.844991e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.764616e-02 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.918315e-16 max(|| b_i - A x_i ||_1) 2.009486e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.070615e-03 (SUCCESS) max(|| x_i ||_oo) 7.029080e-01 max(|| x0_i - x_i ||_oo) 1.387779e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 1.981618e-03 (SUCCESS) WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Time to solve 1.290657e+00 s Time for refinement 1.700440e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.764616e-02 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.951154e-16 
max(|| b_i - A x_i ||_1) 1.982138e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.982808e-03 (SUCCESS) max(|| x_i ||_oo) 7.029080e-01 max(|| x0_i - x_i ||_oo) 1.387779e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 1.981618e-03 (SUCCESS) WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Time to initialize internal csc 1.351050e-01 s Time to initialize coeftab 1.048880e-01 s Time to factorize 1.573798e+00 s (12.89 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Memory usage of coeftab 1.25 Mo Time to solve 1.115157e+00 s Time for refinement 1.289475e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.764616e-02 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.978751e-16 max(|| b_i - A x_i ||_1) 2.007213e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.064879e-03 (SUCCESS) max(|| x_i ||_oo) 7.029080e-01 max(|| x0_i - x_i ||_oo) 1.448884e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.061270e-03 (SUCCESS) Test #77: c_shm_example_step-by-step_lap_z_facto1 .................................***Timeout 210.61 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one +-------------------------------------------------+ 
Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.649703e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.373377e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.202679e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.572041e-01 s Time to initialize internal csc 1.251526e-02 s Time to initialize coeftab 7.264538e-02 s Time to factorize 2.122769e+00 s (10.04 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Memory usage of coeftab 1.25 Mo Time to solve 1.455485e+00 s Time for refinement 1.866127e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.764616e-02 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.766485e-16 max(|| b_i - A x_i ||_1) 1.840006e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.723367e-03 (SUCCESS) max(|| x_i ||_oo) 7.029080e-01 max(|| x0_i - x_i ||_oo) 1.535363e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.250518e-03 (SUCCESS) WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Time to solve 1.227829e+00 s Time for refinement 1.499657e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.764616e-02 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.797006e-16 max(|| b_i - A x_i ||_1) 1.852292e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.767948e-03 (SUCCESS) max(|| x_i ||_oo) 7.029080e-01 max(|| x0_i - x_i ||_oo) 1.577923e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.312903e-03 (SUCCESS) WARNING: Refinement works 
only with 1 rhs, We will iterate on each RHS one by one Time to initialize internal csc 1.030994e-01 s Time to initialize coeftab 9.267460e-02 s Time to factorize 1.524311e+00 s (13.98 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Memory usage of coeftab 1.25 Mo Time to solve 1.275027e+00 s Time for refinement 1.367556e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.764616e-02 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.688795e-16 max(|| b_i - A x_i ||_1) 1.846492e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.716232e-03 (SUCCESS) max(|| x_i ||_oo) 7.029080e-01 max(|| x0_i - x_i ||_oo) 1.341488e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 1.966338e-03 (SUCCESS) Test #80: c_shm_example_step-by-step_lap_z_facto4 .................................***Timeout 210.25 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.367053e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time 
to compute symbol matrix 2.809666e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.753508e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.931177e-01 s Time to initialize internal csc 2.553774e-01 s Time to initialize coeftab 6.346269e-02 s Time to factorize 1.988263e+00 s (10.72 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Memory usage of coeftab 1.25 Mo Time to solve 1.321451e+00 s Time for refinement 1.874813e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.764616e-02 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.815620e-16 max(|| b_i - A x_i ||_1) 1.854540e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.685646e-03 (SUCCESS) max(|| x_i ||_oo) 7.029080e-01 max(|| x0_i - x_i ||_oo) 1.452866e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.129595e-03 (SUCCESS) WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Time to solve 1.066049e+00 s Time for refinement 1.148010e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.764616e-02 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.748166e-16 max(|| b_i - A x_i ||_1) 1.846902e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.754074e-03 (SUCCESS) max(|| x_i ||_oo) 7.029080e-01 max(|| x0_i - x_i ||_oo) 1.341488e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 1.966338e-03 (SUCCESS) WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Test #140: c_shm_example_refinement_lap_d_refine_gmres_sym .........................***Timeout 209.10 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ 
+ PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.781324e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.262528e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.585264e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 2.159699e-01 s Time to initialize internal csc 3.436245e-03 s - iteration 1 : total iteration time 0.405 s error 0.20451 - iteration 2 : total iteration time 0.21 s error 0.05944 - iteration 3 : total iteration time 0.465 s error 0.019007 - iteration 4 : total iteration time 0.564 s error 0.0066596 - iteration 5 : total iteration time 0.453 s error 0.0023054 - iteration 6 : total iteration time 0.51 s error 0.00077935 - iteration 7 : total iteration time 0.449 s error 0.00027759 - iteration 8 : total iteration time 0.586 s error 9.3504e-05 - iteration 9 : total iteration 
time 0.733 s error 3.0631e-05 - iteration 10 : total iteration time 0.873 s error 1.0017e-05 - iteration 11 : total iteration time 0.524 s error 3.0969e-06 - iteration 12 : total iteration time 0.797 s error 9.333e-07 - iteration 13 : total iteration time 0.862 s error 2.7791e-07 - iteration 14 : total iteration time 0.669 s error 8.2065e-08 - iteration 15 : total iteration time 0.657 s error 2.3931e-08 - iteration 16 : total iteration time 0.841 s error 6.7596e-09 - iteration 17 : total iteration time 0.753 s error 1.8866e-09 - iteration 18 : total iteration time 0.884 s error 5.4042e-10 - iteration 19 : total iteration time 0.795 s error 1.7791e-10 - iteration 20 : total iteration time 1.02 s error 6.5041e-11 - iteration 21 : total iteration time 1.18 s error 2.4325e-11 - iteration 22 : total iteration time 1.16 s 953/3626 Test #1039: shm_example_simple_lap_c_facto2_sched1_kway_rqrrtbegin ..................***Timeout 207.62 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 
1000 nnz: 3700 Start 1039: shm_example_simple_lap_c_facto2_sched1_kway_rqrrtbegin Test #89: c_shm_example_personal_lap_s_facto0 .....................................***Timeout 207.63 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Test #90: c_shm_example_personal_lap_s_facto1 .....................................***Timeout 207.62 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Test #91: c_shm_example_personal_lap_s_facto2 .....................................***Timeout 207.61 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + 
PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Test #92: c_shm_example_personal_lap_d_facto0 .....................................***Timeout 207.60 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Test #93: c_shm_example_personal_lap_d_facto1 .....................................***Timeout 207.60 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI 
communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Test #95: c_shm_example_personal_lap_c_facto0 .....................................***Timeout 207.57 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Test #96: c_shm_example_personal_lap_c_facto1 .....................................***Timeout 207.57 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC 
N: 1000 nnz: 3700 Test #97: c_shm_example_personal_lap_c_facto2 .....................................***Timeout 207.56 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Test #98: c_shm_example_personal_lap_c_facto3 .....................................***Timeout 207.55 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Test #99: c_shm_example_personal_lap_c_facto4 .....................................***Timeout 207.54 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Test #100: c_shm_example_personal_lap_z_facto0 .....................................***Timeout 207.53 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Test #101: c_shm_example_personal_lap_z_facto1 .....................................***Timeout 207.52 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: 
PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Test #102: c_shm_example_personal_lap_z_facto2 .....................................***Timeout 207.51 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Test #103: c_shm_example_personal_lap_z_facto3 .....................................***Timeout 207.50 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 
Test #104: c_shm_example_personal_lap_z_facto4 .....................................***Timeout 207.49 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Test #121: c_shm_example_simple_scotch_rsa .........................................***Timeout 207.48 sec RSA driver is no longer supported and is replaced by the HB driver ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 12111 nnz: 40537 Test #125: c_shm_example_simple_single_rsa .........................................***Timeout 207.47 sec RSA driver is no longer supported and is replaced by the HB driver ischedInit: The thread number has been automatically set 
to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 12111 nnz: 40537 Test #129: c_shm_example_step-by-step_single_rsa ...................................***Timeout 207.46 sec RSA driver is no longer supported and is replaced by the HB driver ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 12111 nnz: 40537 Test #133: c_shm_example_simple_refine_cg ..........................................***Timeout 207.45 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread 
dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 12111 nnz: 40537 Test #149: c_shm_example_refinement_lap_z_refine_gmres_her .........................***Timeout 207.44 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Complex64 Format: CSC N: 1000 nnz: 11476 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.628806e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 84938 Fill-in of L 7.401359 Time to compute symbol matrix 6.178474e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.701436e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 169876 Fill-in 14.802719 Number of operations in full-rank: LU 
62.91 MFlops Prediction: Model AMD 6180 MKL Time to factorize 2.138097e-03 s Time for mapping/scheduling 1.609020e-01 s Time to initialize internal csc 2.291376e-02 s - iteration 1 : total iteration time 1.01 s error 0.22315 - iteration 2 : total iteration time 0.275 s error 0.071346 - iteration 3 : total iteration time 0.244 s error 0.027924 - iteration 4 : total iteration time 0.35 s error 0.0091303 - iteration 5 : total iteration time 0.416 s error 0.0041632 - iteration 6 : total iteration time 0.451 s error 0.0019584 - iteration 7 : total iteration time 0.648 s error 0.0006969 - iteration 8 : total iteration time 0.603 s error 0.00021478 - iteration 9 : total iteration time 0.734 s error 9.1664e-05 - iteration 10 : total iteration time 0.729 s error 3.5384e-05 - iteration 11 : total iteration time 0.742 s error 1.5972e-05 - iteration 12 : total iteration time 0.786 s error 6.3906e-06 - iteration 13 : total iteration time 0.827 s error 1.9635e-06 - iteration 14 : total iteration time 0.866 s error 7.3819e-07 - iteration 15 : total iteration time 0.733 s error 2.9725e-07 - iteration 16 : total iteration time 0.935 s error 1.21e-07 - iteration 17 : total iteration time 0.893 s error 5.1897e-08 - iteration 18 : total iteration time 0.921 s error 1.9171e-08 - iteration 19 : total iteration time 0.921 s error 6.9474e-09 - iteration 20 : total iteration time 0.951 s error 2.784e-09 - iteration 21 : total iteration time 1.04 s error 1.2331e-09 - iteration 22 : total iteration time 1.11 s Test #152: c_shm_example_refinement_lap_z_refine_gmres_sym .........................***Timeout 207.43 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads 
per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Test #154: c_shm_example_simple_mixed_refine_cg ....................................***Timeout 207.42 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 12111 nnz: 40537 Test #506: shm_example_simple_lap_c_facto2_sched0_kwayprojections_svdbegin .........***Timeout 207.34 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 
1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 976/3626 Test #1041: shm_example_simple_lap_c_facto2_sched1_kwayprojections_rqrrtbegin .......***Timeout 207.21 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 1041: shm_example_simple_lap_c_facto2_sched1_kwayprojections_rqrrtbegin 976/3626 Test #1062: shm_example_simple_lap_c_facto3_sched1_kwayprojections_rqrcpend ......... Passed 177.65 sec 977/3626 Test #1076: shm_example_simple_lap_c_facto3_sched1_kway_pqrcpilu1 ................... 
Passed 168.14 sec 978/3626 Test #1045: shm_example_simple_lap_c_facto3_sched1_not_svdbegin .....................***Timeout 205.26 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.620532e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.360156e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.461098e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for 
mapping/scheduling 1.477373e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.881641e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 1.202517e-02 s Time to initialize coeftab 8.753986e-01 s Time to factorize 2.091106e+01 s (993.14 KFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 6.496985e-01 s Time for refinement 3.604340e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.183118e-07 max(|| b_i - A x_i ||_1) 9.892361e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.496138e+00 (SUCCESS) Start 1045: shm_example_simple_lap_c_facto3_sched1_not_svdbegin 978/3626 Test #1047: shm_example_simple_lap_c_facto3_sched1_kway_svdbegin ....................***Timeout 201.64 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 
Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.167025e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.382587e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.141235e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 1.448172e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.492509e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 8.383090e-02 s Time to initialize coeftab 3.728937e-01 s Time to factorize 2.264745e+01 s (917.00 KFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 6.211256e-01 s Time for refinement 3.899775e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / 
|| b_i ||_2) 2.260140e-07 max(|| b_i - A x_i ||_1) 1.012653e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.555225e+00 (SUCCESS) Start 1047: shm_example_simple_lap_c_facto3_sched1_kway_svdbegin Start 1450: shm_example_simple_lap_d_facto2_sched4_kway_tqrcpend Start 1451: shm_example_simple_lap_d_facto2_sched4_kwayprojections_tqrcpbegin Start 1452: shm_example_simple_lap_d_facto2_sched4_kwayprojections_tqrcpend Start 1453: shm_example_simple_lap_d_facto2_sched4_not_rqrrtbegin Start 1454: shm_example_simple_lap_d_facto2_sched4_not_rqrrtend Start 1455: shm_example_simple_lap_d_facto2_sched4_kway_rqrrtbegin Start 1456: shm_example_simple_lap_d_facto2_sched4_kway_rqrrtend Start 1457: shm_example_simple_lap_d_facto2_sched4_kwayprojections_rqrrtbegin Start 1458: shm_example_simple_lap_d_facto2_sched4_kwayprojections_rqrrtend Start 1459: shm_example_simple_lap_d_facto2_sched4_kway_pqrcpilu0 Start 1460: shm_example_simple_lap_d_facto2_sched4_kway_pqrcpilu1 Start 1461: shm_example_simple_lap_c_facto0_sched4_not_svdbegin Start 1462: shm_example_simple_lap_c_facto0_sched4_not_svdend Start 1463: shm_example_simple_lap_c_facto0_sched4_kway_svdbegin Start 1464: shm_example_simple_lap_c_facto0_sched4_kway_svdend Start 1465: shm_example_simple_lap_c_facto0_sched4_kwayprojections_svdbegin Start 1466: shm_example_simple_lap_c_facto0_sched4_kwayprojections_svdend Start 1467: shm_example_simple_lap_c_facto0_sched4_not_pqrcpbegin Start 1468: shm_example_simple_lap_c_facto0_sched4_not_pqrcpend Start 1469: shm_example_simple_lap_c_facto0_sched4_kway_pqrcpbegin Start 1470: shm_example_simple_lap_c_facto0_sched4_kway_pqrcpend Start 1471: shm_example_simple_lap_c_facto0_sched4_kwayprojections_pqrcpbegin Start 1472: shm_example_simple_lap_c_facto0_sched4_kwayprojections_pqrcpend Start 1473: shm_example_simple_lap_c_facto0_sched4_not_rqrcpbegin Start 1474: shm_example_simple_lap_c_facto0_sched4_not_rqrcpend Start 1475: shm_example_simple_lap_c_facto0_sched4_kway_rqrcpbegin Start 
1476: shm_example_simple_lap_c_facto0_sched4_kway_rqrcpend Start 1477: shm_example_simple_lap_c_facto0_sched4_kwayprojections_rqrcpbegin Start 1478: shm_example_simple_lap_c_facto0_sched4_kwayprojections_rqrcpend Start 1479: shm_example_simple_lap_c_facto0_sched4_not_tqrcpbegin Start 1480: shm_example_simple_lap_c_facto0_sched4_not_tqrcpend Start 1481: shm_example_simple_lap_c_facto0_sched4_kway_tqrcpbegin Start 1482: shm_example_simple_lap_c_facto0_sched4_kway_tqrcpend Start 1483: shm_example_simple_lap_c_facto0_sched4_kwayprojections_tqrcpbegin Start 1484: shm_example_simple_lap_c_facto0_sched4_kwayprojections_tqrcpend Start 1485: shm_example_simple_lap_c_facto0_sched4_not_rqrrtbegin Start 1486: shm_example_simple_lap_c_facto0_sched4_not_rqrrtend Start 1487: shm_example_simple_lap_c_facto0_sched4_kway_rqrrtbegin Start 1488: shm_example_simple_lap_c_facto0_sched4_kway_rqrrtend Start 1489: shm_example_simple_lap_c_facto0_sched4_kwayprojections_rqrrtbegin 978/3626 Test #1064: shm_example_simple_lap_c_facto3_sched1_not_tqrcpend ..................... Passed 176.00 sec Test #836: shm_example_simple_lap_s_facto2_sched1_kway_rqrcpbegin .................. Passed 170.00 sec 980/3626 Test #1090: shm_example_simple_lap_c_facto4_sched1_not_rqrcpend ..................... Passed 153.96 sec 981/3626 Test #1072: shm_example_simple_lap_c_facto3_sched1_kway_rqrrtend .................... Passed 170.85 sec Test #840: shm_example_simple_lap_s_facto2_sched1_not_tqrcpbegin ................... Passed 166.73 sec 983/3626 Test #1070: shm_example_simple_lap_c_facto3_sched1_not_rqrrtend ..................... Passed 171.79 sec 984/3626 Test #1057: shm_example_simple_lap_c_facto3_sched1_not_rqrcpbegin ................... Passed 185.44 sec 985/3626 Test #1063: shm_example_simple_lap_c_facto3_sched1_not_tqrcpbegin ................... Passed 176.46 sec 986/3626 Test #1053: shm_example_simple_lap_c_facto3_sched1_kway_pqrcpbegin .................. 
Passed 192.96 sec 987/3626 Test #1065: shm_example_simple_lap_c_facto3_sched1_kway_tqrcpbegin .................. Passed 175.93 sec 988/3626 Test #1049: shm_example_simple_lap_c_facto3_sched1_kwayprojections_svdbegin .........***Timeout 202.02 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 1049: shm_example_simple_lap_c_facto3_sched1_kwayprojections_svdbegin Start 1490: shm_example_simple_lap_c_facto0_sched4_kwayprojections_rqrrtend Start 1491: shm_example_simple_lap_c_facto0_sched4_kway_pqrcpilu0 Start 1492: shm_example_simple_lap_c_facto0_sched4_kway_pqrcpilu1 Start 1493: shm_example_simple_lap_c_facto1_sched4_not_svdbegin Start 1494: shm_example_simple_lap_c_facto1_sched4_not_svdend Start 1495: shm_example_simple_lap_c_facto1_sched4_kway_svdbegin Start 1496: shm_example_simple_lap_c_facto1_sched4_kway_svdend Start 1497: shm_example_simple_lap_c_facto1_sched4_kwayprojections_svdbegin Start 1498: 
shm_example_simple_lap_c_facto1_sched4_kwayprojections_svdend Start 1499: shm_example_simple_lap_c_facto1_sched4_not_pqrcpbegin 988/3626 Test #1116: shm_example_simple_lap_z_facto0_sched1_not_pqrcpend ..................... Passed 145.33 sec 989/3626 Test #1120: shm_example_simple_lap_z_facto0_sched1_kwayprojections_pqrcpend ......... Passed 142.00 sec Test #833: shm_example_simple_lap_s_facto2_sched1_kwayprojections_pqrcpend ......... Passed 171.96 sec 991/3626 Test #1073: shm_example_simple_lap_c_facto3_sched1_kwayprojections_rqrrtbegin ....... Passed 171.16 sec 992/3626 Test #1107: shm_example_simple_lap_c_facto4_sched1_kway_pqrcpilu0 ................... Passed 146.98 sec 993/3626 Test #1069: shm_example_simple_lap_c_facto3_sched1_not_rqrrtbegin ................... Passed 173.46 sec Test #842: shm_example_simple_lap_s_facto2_sched1_kway_tqrcpbegin .................. Passed 160.55 sec 995/3626 Test #1083: shm_example_simple_lap_c_facto4_sched1_not_pqrcpbegin ................... Passed 161.90 sec 996/3626 Test #1096: shm_example_simple_lap_c_facto4_sched1_not_tqrcpend ..................... Passed 151.03 sec 997/3626 Test #1084: shm_example_simple_lap_c_facto4_sched1_not_pqrcpend ..................... 
Passed 161.32 sec Start 1500: shm_example_simple_lap_c_facto1_sched4_not_pqrcpend Start 1501: shm_example_simple_lap_c_facto1_sched4_kway_pqrcpbegin Start 1502: shm_example_simple_lap_c_facto1_sched4_kway_pqrcpend Start 1503: shm_example_simple_lap_c_facto1_sched4_kwayprojections_pqrcpbegin Start 1504: shm_example_simple_lap_c_facto1_sched4_kwayprojections_pqrcpend Start 1505: shm_example_simple_lap_c_facto1_sched4_not_rqrcpbegin Start 1506: shm_example_simple_lap_c_facto1_sched4_not_rqrcpend Start 1507: shm_example_simple_lap_c_facto1_sched4_kway_rqrcpbegin Start 1508: shm_example_simple_lap_c_facto1_sched4_kway_rqrcpend Start 1509: shm_example_simple_lap_c_facto1_sched4_kwayprojections_rqrcpbegin 998/3626 Test #1088: shm_example_simple_lap_c_facto4_sched1_kwayprojections_pqrcpend ......... Passed 156.16 sec Start 1510: shm_example_simple_lap_c_facto1_sched4_kwayprojections_rqrcpend 999/3626 Test #1067: shm_example_simple_lap_c_facto3_sched1_kwayprojections_tqrcpbegin ....... Passed 174.05 sec Start 1511: shm_example_simple_lap_c_facto1_sched4_not_tqrcpbegin 1000/3626 Test #1102: shm_example_simple_lap_c_facto4_sched1_not_rqrrtend ..................... Passed 148.96 sec Start 1512: shm_example_simple_lap_c_facto1_sched4_not_tqrcpend 1001/3626 Test #1115: shm_example_simple_lap_z_facto0_sched1_not_pqrcpbegin ................... Passed 146.89 sec Start 1513: shm_example_simple_lap_c_facto1_sched4_kway_tqrcpbegin 1002/3626 Test #1078: shm_example_simple_lap_c_facto4_sched1_not_svdend ....................... Passed 170.47 sec Start 1514: shm_example_simple_lap_c_facto1_sched4_kway_tqrcpend 1003/3626 Test #1077: shm_example_simple_lap_c_facto4_sched1_not_svdbegin ..................... Passed 171.07 sec Start 1515: shm_example_simple_lap_c_facto1_sched4_kwayprojections_tqrcpbegin 1004/3626 Test #1071: shm_example_simple_lap_c_facto3_sched1_kway_rqrrtbegin .................. 
Passed 173.88 sec Start 1516: shm_example_simple_lap_c_facto1_sched4_kwayprojections_tqrcpend 1005/3626 Test #1059: shm_example_simple_lap_c_facto3_sched1_kway_rqrcpbegin .................. Passed 187.50 sec Start 1517: shm_example_simple_lap_c_facto1_sched4_not_rqrrtbegin 1006/3626 Test #1081: shm_example_simple_lap_c_facto4_sched1_kwayprojections_svdbegin ......... Passed 165.31 sec Start 1518: shm_example_simple_lap_c_facto1_sched4_not_rqrrtend 1007/3626 Test #1061: shm_example_simple_lap_c_facto3_sched1_kwayprojections_rqrcpbegin ....... Passed 183.95 sec Start 1519: shm_example_simple_lap_c_facto1_sched4_kway_rqrrtbegin 1008/3626 Test #1111: shm_example_simple_lap_z_facto0_sched1_kway_svdbegin .................... Passed 149.27 sec Start 1520: shm_example_simple_lap_c_facto1_sched4_kway_rqrrtend 1009/3626 Test #1125: shm_example_simple_lap_z_facto0_sched1_kwayprojections_rqrcpbegin ....... Passed 140.91 sec Start 1521: shm_example_simple_lap_c_facto1_sched4_kwayprojections_rqrrtbegin 1010/3626 Test #1091: shm_example_simple_lap_c_facto4_sched1_kway_rqrcpbegin .................. Passed 157.70 sec Start 1522: shm_example_simple_lap_c_facto1_sched4_kwayprojections_rqrrtend 1011/3626 Test #1099: shm_example_simple_lap_c_facto4_sched1_kwayprojections_tqrcpbegin ....... Passed 153.66 sec Start 1523: shm_example_simple_lap_c_facto1_sched4_kway_pqrcpilu0 1012/3626 Test #1093: shm_example_simple_lap_c_facto4_sched1_kwayprojections_rqrcpbegin ....... Passed 158.01 sec Start 1524: shm_example_simple_lap_c_facto1_sched4_kway_pqrcpilu1 1013/3626 Test #1109: shm_example_simple_lap_z_facto0_sched1_not_svdbegin ..................... Passed 150.62 sec Start 1525: shm_example_simple_lap_c_facto2_sched4_not_svdbegin 1014/3626 Test #1089: shm_example_simple_lap_c_facto4_sched1_not_rqrcpbegin ................... 
Passed 159.35 sec Start 1526: shm_example_simple_lap_c_facto2_sched4_not_svdend 1015/3626 Test #1130: shm_example_simple_lap_z_facto0_sched1_kway_tqrcpend .................... Passed 138.97 sec Start 1527: shm_example_simple_lap_c_facto2_sched4_kway_svdbegin 1016/3626 Test #1105: shm_example_simple_lap_c_facto4_sched1_kwayprojections_rqrrtbegin ....... Passed 152.44 sec Start 1528: shm_example_simple_lap_c_facto2_sched4_kway_svdend 1017/3626 Test #1114: shm_example_simple_lap_z_facto0_sched1_kwayprojections_svdend ........... Passed 153.35 sec Start 1529: shm_example_simple_lap_c_facto2_sched4_kwayprojections_svdbegin 1018/3626 Test #1112: shm_example_simple_lap_z_facto0_sched1_kway_svdend ...................... Passed 154.46 sec Start 1530: shm_example_simple_lap_c_facto2_sched4_kwayprojections_svdend 1019/3626 Test #1119: shm_example_simple_lap_z_facto0_sched1_kwayprojections_pqrcpbegin ....... Passed 150.71 sec Start 1531: shm_example_simple_lap_c_facto2_sched4_not_pqrcpbegin 1020/3626 Test #1123: shm_example_simple_lap_z_facto0_sched1_kway_rqrcpbegin .................. Passed 147.27 sec Start 1532: shm_example_simple_lap_c_facto2_sched4_not_pqrcpend 1021/3626 Test #1117: shm_example_simple_lap_z_facto0_sched1_kway_pqrcpbegin .................. Passed 152.67 sec Start 1533: shm_example_simple_lap_c_facto2_sched4_kway_pqrcpbegin 1022/3626 Test #1097: shm_example_simple_lap_c_facto4_sched1_kway_tqrcpbegin .................. Passed 159.44 sec Start 1534: shm_example_simple_lap_c_facto2_sched4_kway_pqrcpend 1023/3626 Test #1066: shm_example_simple_lap_c_facto3_sched1_kway_tqrcpend .................... Passed 187.59 sec Start 1535: shm_example_simple_lap_c_facto2_sched4_kwayprojections_pqrcpbegin Test #876: shm_example_simple_lap_d_facto0_sched1_kwayprojections_tqrcpbegin ....... Passed 144.88 sec Start 1536: shm_example_simple_lap_c_facto2_sched4_kwayprojections_pqrcpend 1025/3626 Test #1086: shm_example_simple_lap_c_facto4_sched1_kway_pqrcpend .................... 
Passed 172.46 sec Start 1537: shm_example_simple_lap_c_facto2_sched4_not_rqrcpbegin 1026/3626 Test #1110: shm_example_simple_lap_z_facto0_sched1_not_svdend ....................... Passed 163.43 sec Start 1538: shm_example_simple_lap_c_facto2_sched4_not_rqrcpend Test #894: shm_example_simple_lap_d_facto1_sched1_kway_pqrcpbegin .................. Passed 143.24 sec Start 1539: shm_example_simple_lap_c_facto2_sched4_kway_rqrcpbegin 1028/3626 Test #1149: shm_example_simple_lap_z_facto1_sched1_kway_pqrcpbegin .................. Passed 141.24 sec Start 1540: shm_example_simple_lap_c_facto2_sched4_kway_rqrcpend 1029/3626 Test #1160: shm_example_simple_lap_z_facto1_sched1_not_tqrcpend ..................... Passed 132.43 sec Start 1541: shm_example_simple_lap_c_facto2_sched4_kwayprojections_rqrcpbegin 1030/3626 Test #1132: shm_example_simple_lap_z_facto0_sched1_kwayprojections_tqrcpend ......... Passed 157.32 sec Start 1542: shm_example_simple_lap_c_facto2_sched4_kwayprojections_rqrcpend 1031/3626 Test #1068: shm_example_simple_lap_c_facto3_sched1_kwayprojections_tqrcpend .........***Timeout 200.60 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and 
projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 1068: shm_example_simple_lap_c_facto3_sched1_kwayprojections_tqrcpend 1031/3626 Test #1168: shm_example_simple_lap_z_facto1_sched1_kway_rqrrtend .................... Passed 135.83 sec Start 1543: shm_example_simple_lap_c_facto2_sched4_not_tqrcpbegin 1032/3626 Test #1085: shm_example_simple_lap_c_facto4_sched1_kway_pqrcpbegin .................. Passed 189.68 sec Start 1544: shm_example_simple_lap_c_facto2_sched4_not_tqrcpend 1033/3626 Test #1145: shm_example_simple_lap_z_facto1_sched1_kwayprojections_svdbegin ......... Passed 153.15 sec Start 1545: shm_example_simple_lap_c_facto2_sched4_kway_tqrcpbegin 1034/3626 Test #1147: shm_example_simple_lap_z_facto1_sched1_not_pqrcpbegin ................... Passed 152.13 sec Start 1546: shm_example_simple_lap_c_facto2_sched4_kway_tqrcpend 1035/3626 Test #1079: shm_example_simple_lap_c_facto4_sched1_kway_svdbegin ....................***Timeout 200.12 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels 
of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 1079: shm_example_simple_lap_c_facto4_sched1_kway_svdbegin Test #867: shm_example_simple_lap_d_facto0_sched1_not_rqrcpend ..................... Passed 164.43 sec Start 1547: shm_example_simple_lap_c_facto2_sched4_kwayprojections_tqrcpbegin 1036/3626 Test #1127: shm_example_simple_lap_z_facto0_sched1_not_tqrcpbegin ................... Passed 172.01 sec Start 1548: shm_example_simple_lap_c_facto2_sched4_kwayprojections_tqrcpend 1037/3626 Test #1186: shm_example_simple_lap_z_facto2_sched1_not_rqrcpend ..................... Passed 138.85 sec Start 1549: shm_example_simple_lap_c_facto2_sched4_not_rqrrtbegin 1038/3626 Test #1142: shm_example_simple_lap_z_facto1_sched1_not_svdend ....................... Passed 159.35 sec Start 1550: shm_example_simple_lap_c_facto2_sched4_not_rqrrtend 1039/3626 Test #1094: shm_example_simple_lap_c_facto4_sched1_kwayprojections_rqrcpend ......... Passed 190.11 sec Test #913: shm_example_simple_lap_d_facto1_sched1_kway_rqrrtend .................... Passed 154.97 sec Start 1551: shm_example_simple_lap_c_facto2_sched4_kway_rqrrtbegin Start 1552: shm_example_simple_lap_c_facto2_sched4_kway_rqrrtend 1041/3626 Test #1087: shm_example_simple_lap_c_facto4_sched1_kwayprojections_pqrcpbegin ....... Passed 192.63 sec Start 1553: shm_example_simple_lap_c_facto2_sched4_kwayprojections_rqrrtbegin 1042/3626 Test #1190: shm_example_simple_lap_z_facto2_sched1_kwayprojections_rqrcpend ......... Passed 138.52 sec Start 1554: shm_example_simple_lap_c_facto2_sched4_kwayprojections_rqrrtend 1043/3626 Test #1128: shm_example_simple_lap_z_facto0_sched1_not_tqrcpend ..................... Passed 173.99 sec Start 1555: shm_example_simple_lap_c_facto2_sched4_kway_pqrcpilu0 1044/3626 Test #1151: shm_example_simple_lap_z_facto1_sched1_kwayprojections_pqrcpbegin ....... 
Passed 156.62 sec Start 1556: shm_example_simple_lap_c_facto2_sched4_kway_pqrcpilu1 1045/3626 Test #1164: shm_example_simple_lap_z_facto1_sched1_kwayprojections_tqrcpend ......... Passed 146.71 sec Start 1557: shm_example_simple_lap_c_facto3_sched4_not_svdbegin 1046/3626 Test #1082: shm_example_simple_lap_c_facto4_sched1_kwayprojections_svdend ........... Passed 199.88 sec Start 1558: shm_example_simple_lap_c_facto3_sched4_not_svdend 1047/3626 Test #1159: shm_example_simple_lap_z_facto1_sched1_not_tqrcpbegin ................... Passed 148.64 sec Start 1559: shm_example_simple_lap_c_facto3_sched4_kway_svdbegin 1048/3626 Test #1140: shm_example_simple_lap_z_facto0_sched1_kway_pqrcpilu1 ................... Passed 162.57 sec Start 1560: shm_example_simple_lap_c_facto3_sched4_kway_svdend 1049/3626 Test #1148: shm_example_simple_lap_z_facto1_sched1_not_pqrcpend ..................... Passed 160.69 sec Start 1561: shm_example_simple_lap_c_facto3_sched4_kwayprojections_svdbegin Test #911: shm_example_simple_lap_d_facto1_sched1_not_rqrrtend ..................... Passed 159.13 sec Start 1562: shm_example_simple_lap_c_facto3_sched4_kwayprojections_svdend 1051/3626 Test #1176: shm_example_simple_lap_z_facto2_sched1_kway_svdend ...................... Passed 146.02 sec Start 1563: shm_example_simple_lap_c_facto3_sched4_not_pqrcpbegin 1052/3626 Test #1108: shm_example_simple_lap_c_facto4_sched1_kway_pqrcpilu1 ................... Passed 186.97 sec Start 1564: shm_example_simple_lap_c_facto3_sched4_not_pqrcpend 1053/3626 Test #1144: shm_example_simple_lap_z_facto1_sched1_kway_svdend ...................... Passed 163.36 sec Start 1565: shm_example_simple_lap_c_facto3_sched4_kway_pqrcpbegin 1054/3626 Test #1192: shm_example_simple_lap_z_facto2_sched1_not_tqrcpend ..................... Passed 139.21 sec Start 1566: shm_example_simple_lap_c_facto3_sched4_kway_pqrcpend 1055/3626 Test #1153: shm_example_simple_lap_z_facto1_sched1_not_rqrcpbegin ................... 
Passed 153.70 sec Start 1567: shm_example_simple_lap_c_facto3_sched4_kwayprojections_pqrcpbegin 1056/3626 Test #1189: shm_example_simple_lap_z_facto2_sched1_kwayprojections_rqrcpbegin ....... Passed 142.74 sec Start 1568: shm_example_simple_lap_c_facto3_sched4_kwayprojections_pqrcpend 1057/3626 Test #1124: shm_example_simple_lap_z_facto0_sched1_kway_rqrcpend .................... Passed 179.46 sec Start 1569: shm_example_simple_lap_c_facto3_sched4_not_rqrcpbegin Test #885: shm_example_simple_lap_d_facto0_sched1_kway_pqrcpilu1 ................... Passed 167.44 sec Start 1570: shm_example_simple_lap_c_facto3_sched4_not_rqrcpend 1059/3626 Test #1131: shm_example_simple_lap_z_facto0_sched1_kwayprojections_tqrcpbegin ....... Passed 175.25 sec Start 1571: shm_example_simple_lap_c_facto3_sched4_kway_rqrcpbegin 1060/3626 Test #1134: shm_example_simple_lap_z_facto0_sched1_not_rqrrtend ..................... Passed 175.13 sec Start 1572: shm_example_simple_lap_c_facto3_sched4_kway_rqrcpend 1061/3626 Test #1137: shm_example_simple_lap_z_facto0_sched1_kwayprojections_rqrrtbegin ....... Passed 173.24 sec Start 1573: shm_example_simple_lap_c_facto3_sched4_kwayprojections_rqrcpbegin 1062/3626 Test #1158: shm_example_simple_lap_z_facto1_sched1_kwayprojections_rqrcpend ......... Passed 155.66 sec Start 1574: shm_example_simple_lap_c_facto3_sched4_kwayprojections_rqrcpend 1063/3626 Test #1092: shm_example_simple_lap_c_facto4_sched1_kway_rqrcpend ....................***Timeout 200.05 sec Start 1092: shm_example_simple_lap_c_facto4_sched1_kway_rqrcpend 1063/3626 Test #1113: shm_example_simple_lap_z_facto0_sched1_kwayprojections_svdbegin ......... Passed 192.12 sec Start 1575: shm_example_simple_lap_c_facto3_sched4_not_tqrcpbegin 1064/3626 Test #1157: shm_example_simple_lap_z_facto1_sched1_kwayprojections_rqrcpbegin ....... 
Passed 157.62 sec Start 1576: shm_example_simple_lap_c_facto3_sched4_not_tqrcpend 1065/3626 Test #1095: shm_example_simple_lap_c_facto4_sched1_not_tqrcpbegin ...................***Timeout 200.08 sec Start 1095: shm_example_simple_lap_c_facto4_sched1_not_tqrcpbegin 1065/3626 Test #1170: shm_example_simple_lap_z_facto1_sched1_kwayprojections_rqrrtend ......... Passed 156.17 sec Start 1577: shm_example_simple_lap_c_facto3_sched4_kway_tqrcpbegin 1066/3626 Test #1173: shm_example_simple_lap_z_facto2_sched1_not_svdbegin ..................... Passed 155.31 sec Start 1578: shm_example_simple_lap_c_facto3_sched4_kway_tqrcpend 1067/3626 Test #1191: shm_example_simple_lap_z_facto2_sched1_not_tqrcpbegin ................... Passed 150.23 sec Start 1579: shm_example_simple_lap_c_facto3_sched4_kwayprojections_tqrcpbegin 1068/3626 Test #1175: shm_example_simple_lap_z_facto2_sched1_kway_svdbegin .................... Passed 154.87 sec Start 1580: shm_example_simple_lap_c_facto3_sched4_kwayprojections_tqrcpend 1069/3626 Test #1141: shm_example_simple_lap_z_facto1_sched1_not_svdbegin ..................... 
Passed 172.64 sec Start 1581: shm_example_simple_lap_c_facto3_sched4_not_rqrrtbegin 1070/3626 Test #1098: shm_example_simple_lap_c_facto4_sched1_kway_tqrcpend ....................***Timeout 200.19 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 1098: shm_example_simple_lap_c_facto4_sched1_kway_tqrcpend 1070/3626 Test #1100: shm_example_simple_lap_c_facto4_sched1_kwayprojections_tqrcpend .........***Timeout 200.14 sec Start 1100: shm_example_simple_lap_c_facto4_sched1_kwayprojections_tqrcpend 1070/3626 Test #1180: shm_example_simple_lap_z_facto2_sched1_not_pqrcpend ..................... Passed 156.56 sec Start 1582: shm_example_simple_lap_c_facto3_sched4_not_rqrrtend 1071/3626 Test #1150: shm_example_simple_lap_z_facto1_sched1_kway_pqrcpend .................... 
Passed 170.87 sec Start 1583: shm_example_simple_lap_c_facto3_sched4_kway_rqrrtbegin 1072/3626 Test #1101: shm_example_simple_lap_c_facto4_sched1_not_rqrrtbegin ...................***Timeout 200.27 sec ischedInit: The thread number has been automatically set to 256 Start 1101: shm_example_simple_lap_c_facto4_sched1_not_rqrrtbegin 1072/3626 Test #1182: shm_example_simple_lap_z_facto2_sched1_kway_pqrcpend .................... Passed 156.16 sec Start 1584: shm_example_simple_lap_c_facto3_sched4_kway_rqrrtend 1073/3626 Test #1104: shm_example_simple_lap_c_facto4_sched1_kway_rqrrtend ....................***Timeout 200.07 sec Start 1104: shm_example_simple_lap_c_facto4_sched1_kway_rqrrtend 1073/3626 Test #1184: shm_example_simple_lap_z_facto2_sched1_kwayprojections_pqrcpend ......... Passed 155.66 sec Start 1585: shm_example_simple_lap_c_facto3_sched4_kwayprojections_rqrrtbegin 1074/3626 Test #1103: shm_example_simple_lap_c_facto4_sched1_kway_rqrrtbegin ..................***Timeout 200.27 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 
Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 1103: shm_example_simple_lap_c_facto4_sched1_kway_rqrrtbegin Test #927: shm_example_simple_lap_d_facto2_sched1_kway_pqrcpend .................... Passed 163.82 sec Start 1586: shm_example_simple_lap_c_facto3_sched4_kwayprojections_rqrrtend 1075/3626 Test #1166: shm_example_simple_lap_z_facto1_sched1_not_rqrrtend ..................... Passed 161.61 sec Start 1587: shm_example_simple_lap_c_facto3_sched4_kway_pqrcpilu0 1076/3626 Test #1106: shm_example_simple_lap_c_facto4_sched1_kwayprojections_rqrrtend .........***Timeout 200.12 sec Start 1106: shm_example_simple_lap_c_facto4_sched1_kwayprojections_rqrrtend 1076/3626 Test #1139: shm_example_simple_lap_z_facto0_sched1_kway_pqrcpilu0 ................... Passed 179.75 sec Start 1588: shm_example_simple_lap_c_facto3_sched4_kway_pqrcpilu1 1077/3626 Test #1198: shm_example_simple_lap_z_facto2_sched1_not_rqrrtend ..................... Passed 145.50 sec Start 1589: shm_example_simple_lap_c_facto4_sched4_not_svdbegin 1078/3626 Test #1200: shm_example_simple_lap_z_facto2_sched1_kway_rqrrtend .................... Passed 139.92 sec Start 1590: shm_example_simple_lap_c_facto4_sched4_not_svdend Test #889: shm_example_simple_lap_d_facto1_sched1_kway_svdend ...................... 
Passed 179.57 sec Start 1591: shm_example_simple_lap_c_facto4_sched4_kway_svdbegin Test #850: shm_example_simple_lap_s_facto2_sched1_kwayprojections_rqrrtbegin .......***Timeout 200.31 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 850: shm_example_simple_lap_s_facto2_sched1_kwayprojections_rqrrtbegin Test #914: shm_example_simple_lap_d_facto1_sched1_kwayprojections_rqrrtbegin ....... Passed 174.34 sec Start 1592: shm_example_simple_lap_c_facto4_sched4_kway_svdend 1081/3626 Test #1135: shm_example_simple_lap_z_facto0_sched1_kway_rqrrtbegin .................. 
Passed 187.38 sec Start 1593: shm_example_simple_lap_c_facto4_sched4_kwayprojections_svdbegin 1082/3626 Test #1118: shm_example_simple_lap_z_facto0_sched1_kway_pqrcpend ....................***Timeout 200.40 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 1118: shm_example_simple_lap_z_facto0_sched1_kway_pqrcpend 1082/3626 Test #1146: shm_example_simple_lap_z_facto1_sched1_kwayprojections_svdend ........... Passed 179.38 sec Start 1594: shm_example_simple_lap_c_facto4_sched4_kwayprojections_svdend Test #937: shm_example_simple_lap_d_facto2_sched1_kway_tqrcpbegin .................. Passed 158.93 sec Start 1595: shm_example_simple_lap_c_facto4_sched4_not_pqrcpbegin 1084/3626 Test #1161: shm_example_simple_lap_z_facto1_sched1_kway_tqrcpbegin .................. Passed 168.16 sec Start 1596: shm_example_simple_lap_c_facto4_sched4_not_pqrcpend 1085/3626 Test #1197: shm_example_simple_lap_z_facto2_sched1_not_rqrrtbegin ................... 
Passed 151.16 sec Start 1597: shm_example_simple_lap_c_facto4_sched4_kway_pqrcpbegin Test #943: shm_example_simple_lap_d_facto2_sched1_kway_rqrrtbegin .................. Passed 148.56 sec Start 1598: shm_example_simple_lap_c_facto4_sched4_kway_pqrcpend 1087/3626 Test #1196: shm_example_simple_lap_z_facto2_sched1_kwayprojections_tqrcpend ......... Passed 151.78 sec Start 1599: shm_example_simple_lap_c_facto4_sched4_kwayprojections_pqrcpbegin 1088/3626 Test #1179: shm_example_simple_lap_z_facto2_sched1_not_pqrcpbegin ................... Passed 163.65 sec Start 1600: shm_example_simple_lap_c_facto4_sched4_kwayprojections_pqrcpend 1089/3626 Test #1193: shm_example_simple_lap_z_facto2_sched1_kway_tqrcpbegin .................. Passed 154.04 sec Start 1601: shm_example_simple_lap_c_facto4_sched4_not_rqrcpbegin 1090/3626 Test #1199: shm_example_simple_lap_z_facto2_sched1_kway_rqrrtbegin .................. Passed 145.76 sec Start 1602: shm_example_simple_lap_c_facto4_sched4_not_rqrcpend 1091/3626 Test #1177: shm_example_simple_lap_z_facto2_sched1_kwayprojections_svdbegin ......... Passed 163.92 sec 1092/3626 Test #1185: shm_example_simple_lap_z_facto2_sched1_not_rqrcpbegin ................... 
Passed 161.99 sec Start 1603: shm_example_simple_lap_c_facto4_sched4_kway_rqrcpbegin Start 1604: shm_example_simple_lap_c_facto4_sched4_kway_rqrcpend 1093/3626 Test #1121: shm_example_simple_lap_z_facto0_sched1_not_rqrcpbegin ...................***Timeout 200.11 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 1121: shm_example_simple_lap_z_facto0_sched1_not_rqrcpbegin 1093/3626 Test #1162: shm_example_simple_lap_z_facto1_sched1_kway_tqrcpend .................... Passed 169.74 sec Start 1605: shm_example_simple_lap_c_facto4_sched4_kwayprojections_rqrcpbegin 1094/3626 Test #1194: shm_example_simple_lap_z_facto2_sched1_kway_tqrcpend .................... Passed 155.94 sec Start 1606: shm_example_simple_lap_c_facto4_sched4_kwayprojections_rqrcpend 1095/3626 Test #1222: shm_example_simple_lap_z_facto3_sched1_kwayprojections_rqrcpend ......... 
Passed 132.52 sec Start 1607: shm_example_simple_lap_c_facto4_sched4_not_tqrcpbegin 1096/3626 Test #1122: shm_example_simple_lap_z_facto0_sched1_not_rqrcpend .....................***Timeout 200.32 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 1122: shm_example_simple_lap_z_facto0_sched1_not_rqrcpend 1096/3626 Test #1228: shm_example_simple_lap_z_facto3_sched1_kwayprojections_tqrcpend ......... Passed 126.67 sec Start 1608: shm_example_simple_lap_c_facto4_sched4_not_tqrcpend Test #972: shm_example_simple_lap_c_facto0_sched1_kwayprojections_tqrcpend ......... 
Passed 131.80 sec Start 1609: shm_example_simple_lap_c_facto4_sched4_kway_tqrcpbegin 1098/3626 Test #1126: shm_example_simple_lap_z_facto0_sched1_kwayprojections_rqrcpend .........***Timeout 200.11 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.904444e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.494306e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.155219e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops 
Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 1.792985e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.209550e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.051449e-02 s Time to initialize coeftab 7.248223e-01 s Time to factorize 3.452247e-01 s (58.75 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 497 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1.25 Mo / 1.25 Mo Time to solve 5.181318e-01 s Time for refinement 8.047501e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.002685e-16 max(|| b_i - A x_i ||_1) 2.026540e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.113648e-03 (SUCCESS) Start 1126: shm_example_simple_lap_z_facto0_sched1_kwayprojections_rqrcpend 1098/3626 Test #1174: shm_example_simple_lap_z_facto2_sched1_not_svdend ....................... Passed 169.25 sec Start 1610: shm_example_simple_lap_c_facto4_sched4_kway_tqrcpend 1099/3626 Test #1154: shm_example_simple_lap_z_facto1_sched1_not_rqrcpend ..................... Passed 175.57 sec Start 1611: shm_example_simple_lap_c_facto4_sched4_kwayprojections_tqrcpbegin 1100/3626 Test #1220: shm_example_simple_lap_z_facto3_sched1_kway_rqrcpend .................... Passed 137.95 sec Start 1612: shm_example_simple_lap_c_facto4_sched4_kwayprojections_tqrcpend 1101/3626 Test #1143: shm_example_simple_lap_z_facto1_sched1_kway_svdbegin .................... 
Passed 186.87 sec Start 1613: shm_example_simple_lap_c_facto4_sched4_not_rqrrtbegin 1102/3626 Test #1172: shm_example_simple_lap_z_facto1_sched1_kway_pqrcpilu1 ................... Passed 171.11 sec Start 1614: shm_example_simple_lap_c_facto4_sched4_not_rqrrtend 1103/3626 Test #1136: shm_example_simple_lap_z_facto0_sched1_kway_rqrrtend .................... Passed 193.87 sec Start 1615: shm_example_simple_lap_c_facto4_sched4_kway_rqrrtbegin 1104/3626 Test #1129: shm_example_simple_lap_z_facto0_sched1_kway_tqrcpbegin .................. Passed 199.71 sec Start 1616: shm_example_simple_lap_c_facto4_sched4_kway_rqrrtend 1105/3626 Test #1215: shm_example_simple_lap_z_facto3_sched1_kwayprojections_pqrcpbegin ....... Passed 140.75 sec Start 1617: shm_example_simple_lap_c_facto4_sched4_kwayprojections_rqrrtbegin 1106/3626 Test #1133: shm_example_simple_lap_z_facto0_sched1_not_rqrrtbegin ................... Passed 199.15 sec Start 1618: shm_example_simple_lap_c_facto4_sched4_kwayprojections_rqrrtend 1107/3626 Test #1155: shm_example_simple_lap_z_facto1_sched1_kway_rqrcpbegin .................. Passed 178.34 sec Start 1619: shm_example_simple_lap_c_facto4_sched4_kway_pqrcpilu0 1108/3626 Test #1152: shm_example_simple_lap_z_facto1_sched1_kwayprojections_pqrcpend ......... Passed 181.12 sec Start 1620: shm_example_simple_lap_c_facto4_sched4_kway_pqrcpilu1 1109/3626 Test #1234: shm_example_simple_lap_z_facto3_sched1_kwayprojections_rqrrtend ......... Passed 126.23 sec Start 1621: shm_example_simple_lap_z_facto0_sched4_not_svdbegin 1110/3626 Test #1214: shm_example_simple_lap_z_facto3_sched1_kway_pqrcpend .................... Passed 143.24 sec Start 1622: shm_example_simple_lap_z_facto0_sched4_not_svdend 1111/3626 Test #1188: shm_example_simple_lap_z_facto2_sched1_kway_rqrcpend .................... Passed 169.18 sec Start 1623: shm_example_simple_lap_z_facto0_sched4_kway_svdbegin 1112/3626 Test #1218: shm_example_simple_lap_z_facto3_sched1_not_rqrcpend ..................... 
Passed 143.01 sec Start 1624: shm_example_simple_lap_z_facto0_sched4_kway_svdend 1113/3626 Test #1224: shm_example_simple_lap_z_facto3_sched1_not_tqrcpend ..................... Passed 137.26 sec Start 1625: shm_example_simple_lap_z_facto0_sched4_kwayprojections_svdbegin 1114/3626 Test #1210: shm_example_simple_lap_z_facto3_sched1_kwayprojections_svdend ........... Passed 145.57 sec Start 1626: shm_example_simple_lap_z_facto0_sched4_kwayprojections_svdend Test #898: shm_example_simple_lap_d_facto1_sched1_not_rqrcpbegin ................... Passed 193.04 sec Start 1627: shm_example_simple_lap_z_facto0_sched4_not_pqrcpbegin 1116/3626 Test #1183: shm_example_simple_lap_z_facto2_sched1_kwayprojections_pqrcpbegin ....... Passed 173.16 sec Start 1628: shm_example_simple_lap_z_facto0_sched4_not_pqrcpend 1117/3626 Test #1171: shm_example_simple_lap_z_facto1_sched1_kway_pqrcpilu0 ................... Passed 176.75 sec Start 1629: shm_example_simple_lap_z_facto0_sched4_kway_pqrcpbegin Test #987: shm_example_simple_lap_c_facto1_sched1_not_pqrcpbegin ................... Passed 125.72 sec Start 1630: shm_example_simple_lap_z_facto0_sched4_kway_pqrcpend 1119/3626 Test #1221: shm_example_simple_lap_z_facto3_sched1_kwayprojections_rqrcpbegin ....... Passed 141.76 sec Start 1631: shm_example_simple_lap_z_facto0_sched4_kwayprojections_pqrcpbegin 1120/3626 Test #1167: shm_example_simple_lap_z_facto1_sched1_kway_rqrrtbegin .................. Passed 179.61 sec Start 1632: shm_example_simple_lap_z_facto0_sched4_kwayprojections_pqrcpend Test #878: shm_example_simple_lap_d_facto0_sched1_not_rqrrtbegin ...................***Timeout 200.02 sec Start 878: shm_example_simple_lap_d_facto0_sched1_not_rqrrtbegin 1121/3626 Test #1138: shm_example_simple_lap_z_facto0_sched1_kwayprojections_rqrrtend .........***Timeout 200.09 sec Start 1138: shm_example_simple_lap_z_facto0_sched1_kwayprojections_rqrrtend Test #959: shm_example_simple_lap_c_facto0_sched1_kwayprojections_pqrcpbegin ....... 
Passed 149.41 sec Start 1633: shm_example_simple_lap_z_facto0_sched4_not_rqrcpbegin 1122/3626 Test #1208: shm_example_simple_lap_z_facto3_sched1_kway_svdend ...................... Passed 152.73 sec Start 1634: shm_example_simple_lap_z_facto0_sched4_not_rqrcpend 1123/3626 Test #1204: shm_example_simple_lap_z_facto2_sched1_kway_pqrcpilu1 ................... Passed 155.67 sec Start 1635: shm_example_simple_lap_z_facto0_sched4_kway_rqrcpbegin Test #977: shm_example_simple_lap_c_facto0_sched1_kwayprojections_rqrrtbegin ....... Passed 142.23 sec Start 1636: shm_example_simple_lap_z_facto0_sched4_kway_rqrcpend Test #928: shm_example_simple_lap_d_facto2_sched1_kwayprojections_pqrcpbegin ....... Passed 189.38 sec Start 1637: shm_example_simple_lap_z_facto0_sched4_kwayprojections_rqrcpbegin 1126/3626 Test #1181: shm_example_simple_lap_z_facto2_sched1_kway_pqrcpbegin .................. Passed 183.61 sec Start 1638: shm_example_simple_lap_z_facto0_sched4_kwayprojections_rqrcpend 1127/3626 Test #1169: shm_example_simple_lap_z_facto1_sched1_kwayprojections_rqrrtbegin ....... Passed 186.50 sec Start 1639: shm_example_simple_lap_z_facto0_sched4_not_tqrcpbegin 1128/3626 Test #1247: shm_example_simple_lap_z_facto4_sched1_kwayprojections_pqrcpbegin ....... Passed 134.77 sec Start 1640: shm_example_simple_lap_z_facto0_sched4_not_tqrcpend Test #988: shm_example_simple_lap_c_facto1_sched1_not_pqrcpend ..................... Passed 136.21 sec Start 1641: shm_example_simple_lap_z_facto0_sched4_kway_tqrcpbegin 1130/3626 Test #1202: shm_example_simple_lap_z_facto2_sched1_kwayprojections_rqrrtend ......... Passed 164.99 sec Start 1642: shm_example_simple_lap_z_facto0_sched4_kway_tqrcpend Test #931: shm_example_simple_lap_d_facto2_sched1_not_rqrcpend ..................... Passed 191.75 sec Start 1643: shm_example_simple_lap_z_facto0_sched4_kwayprojections_tqrcpbegin 1132/3626 Test #1238: shm_example_simple_lap_z_facto4_sched1_not_svdend ....................... 
Passed 141.86 sec Start 1644: shm_example_simple_lap_z_facto0_sched4_kwayprojections_tqrcpend Test #990: shm_example_simple_lap_c_facto1_sched1_kway_pqrcpend .................... Passed 138.49 sec Start 1645: shm_example_simple_lap_z_facto0_sched4_not_rqrrtbegin 1134/3626 Test #1229: shm_example_simple_lap_z_facto3_sched1_not_rqrrtbegin ................... Passed 149.00 sec Start 1646: shm_example_simple_lap_z_facto0_sched4_not_rqrrtend 1135/3626 Test #1219: shm_example_simple_lap_z_facto3_sched1_kway_rqrcpbegin .................. Passed 159.74 sec Start 1647: shm_example_simple_lap_z_facto0_sched4_kway_rqrrtbegin 1136/3626 Test #1201: shm_example_simple_lap_z_facto2_sched1_kwayprojections_rqrrtbegin ....... Passed 170.05 sec Start 1648: shm_example_simple_lap_z_facto0_sched4_kway_rqrrtend 1137/3626 Test #1233: shm_example_simple_lap_z_facto3_sched1_kwayprojections_rqrrtbegin ....... Passed 146.61 sec Start 1649: shm_example_simple_lap_z_facto0_sched4_kwayprojections_rqrrtbegin 1138/3626 Test #1156: shm_example_simple_lap_z_facto1_sched1_kway_rqrcpend .................... Passed 198.86 sec Start 1650: shm_example_simple_lap_z_facto0_sched4_kwayprojections_rqrrtend 1139/3626 Test #1252: shm_example_simple_lap_z_facto4_sched1_kway_rqrcpend .................... Passed 139.10 sec Start 1651: shm_example_simple_lap_z_facto0_sched4_kway_pqrcpilu0 1140/3626 Test #1230: shm_example_simple_lap_z_facto3_sched1_not_rqrrtend ..................... Passed 148.48 sec Start 1652: shm_example_simple_lap_z_facto0_sched4_kway_pqrcpilu1 1141/3626 Test #1236: shm_example_simple_lap_z_facto3_sched1_kway_pqrcpilu1 ................... Passed 147.42 sec Start 1653: shm_example_simple_lap_z_facto1_sched4_not_svdbegin 1142/3626 Test #1241: shm_example_simple_lap_z_facto4_sched1_kwayprojections_svdbegin ......... 
Passed 146.05 sec Start 1654: shm_example_simple_lap_z_facto1_sched4_not_svdend 1143/3626 Test #1163: shm_example_simple_lap_z_facto1_sched1_kwayprojections_tqrcpbegin .......***Timeout 200.16 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 1163: shm_example_simple_lap_z_facto1_sched1_kwayprojections_tqrcpbegin 1143/3626 Test #1165: shm_example_simple_lap_z_facto1_sched1_not_rqrrtbegin ...................***Timeout 200.02 sec Start 1165: shm_example_simple_lap_z_facto1_sched1_not_rqrrtbegin Test #949: shm_example_simple_lap_c_facto0_sched1_not_svdbegin ..................... Passed 177.11 sec Start 1655: shm_example_simple_lap_z_facto1_sched4_kway_svdbegin 1144/3626 Test #1249: shm_example_simple_lap_z_facto4_sched1_not_rqrcpbegin ................... 
Passed 146.12 sec Start 1656: shm_example_simple_lap_z_facto1_sched4_kway_svdend 1145/3626 Test #1178: shm_example_simple_lap_z_facto2_sched1_kwayprojections_svdend ...........***Timeout 200.54 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 1178: shm_example_simple_lap_z_facto2_sched1_kwayprojections_svdend 1145/3626 Test #1262: shm_example_simple_lap_z_facto4_sched1_not_rqrrtend ..................... Passed 139.87 sec Start 1657: shm_example_simple_lap_z_facto1_sched4_kwayprojections_svdbegin 1146/3626 Test #1213: shm_example_simple_lap_z_facto3_sched1_kway_pqrcpbegin .................. Passed 172.86 sec Start 1658: shm_example_simple_lap_z_facto1_sched4_kwayprojections_svdend 1147/3626 Test #1209: shm_example_simple_lap_z_facto3_sched1_kwayprojections_svdbegin ......... 
Passed 174.08 sec Start 1659: shm_example_simple_lap_z_facto1_sched4_not_pqrcpbegin 1148/3626 Test #1217: shm_example_simple_lap_z_facto3_sched1_not_rqrcpbegin ................... Passed 172.15 sec Start 1660: shm_example_simple_lap_z_facto1_sched4_not_pqrcpend 1149/3626 Test #1207: shm_example_simple_lap_z_facto3_sched1_kway_svdbegin .................... Passed 175.36 sec Start 1661: shm_example_simple_lap_z_facto1_sched4_kway_pqrcpbegin 1150/3626 Test #1280: shm_example_simple_lap_s_facto0_sched4_kwayprojections_pqrcpend ......... Passed 133.38 sec Start 1662: shm_example_simple_lap_z_facto1_sched4_kway_pqrcpend 1151/3626 Test #1187: shm_example_simple_lap_z_facto2_sched1_kway_rqrcpbegin ..................***Timeout 200.12 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 1187: shm_example_simple_lap_z_facto2_sched1_kway_rqrcpbegin Test #1001: shm_example_simple_lap_c_facto1_sched1_kway_tqrcpbegin .................. 
Passed 147.66 sec Start 1663: shm_example_simple_lap_z_facto1_sched4_kwayprojections_pqrcpbegin 1152/3626 Test #1269: shm_example_simple_lap_s_facto0_sched4_not_svdbegin ..................... Passed 142.74 sec Start 1664: shm_example_simple_lap_z_facto1_sched4_kwayprojections_pqrcpend 1153/3626 Test #1259: shm_example_simple_lap_z_facto4_sched1_kwayprojections_tqrcpbegin ....... Passed 146.42 sec Start 1665: shm_example_simple_lap_z_facto1_sched4_not_rqrcpbegin Test #992: shm_example_simple_lap_c_facto1_sched1_kwayprojections_pqrcpend ......... Passed 157.45 sec Start 1666: shm_example_simple_lap_z_facto1_sched4_not_rqrcpend 1155/3626 Test #1285: shm_example_simple_lap_s_facto0_sched4_kwayprojections_rqrcpbegin ....... Passed 138.20 sec Start 1667: shm_example_simple_lap_z_facto1_sched4_kway_rqrcpbegin 1156/3626 Test #1278: shm_example_simple_lap_s_facto0_sched4_kway_pqrcpend .................... Passed 140.43 sec Start 1668: shm_example_simple_lap_z_facto1_sched4_kway_rqrcpend 1157/3626 Test #1275: shm_example_simple_lap_s_facto0_sched4_not_pqrcpbegin ................... Passed 142.74 sec Start 1669: shm_example_simple_lap_z_facto1_sched4_kwayprojections_rqrcpbegin 1158/3626 Test #1195: shm_example_simple_lap_z_facto2_sched1_kwayprojections_tqrcpbegin .......***Timeout 200.03 sec Start 1195: shm_example_simple_lap_z_facto2_sched1_kwayprojections_tqrcpbegin 1158/3626 Test #1254: shm_example_simple_lap_z_facto4_sched1_kwayprojections_rqrcpend ......... Passed 157.71 sec Start 1670: shm_example_simple_lap_z_facto1_sched4_kwayprojections_rqrcpend 1159/3626 Test #1266: shm_example_simple_lap_z_facto4_sched1_kwayprojections_rqrrtend ......... Passed 147.48 sec Start 1671: shm_example_simple_lap_z_facto1_sched4_not_tqrcpbegin 1160/3626 Test #1216: shm_example_simple_lap_z_facto3_sched1_kwayprojections_pqrcpend ......... 
Passed 181.46 sec Start 1672: shm_example_simple_lap_z_facto1_sched4_not_tqrcpend 1161/3626 Test #1261: shm_example_simple_lap_z_facto4_sched1_not_rqrrtbegin ................... Passed 151.94 sec Start 1673: shm_example_simple_lap_z_facto1_sched4_kway_tqrcpbegin 1162/3626 Test #1212: shm_example_simple_lap_z_facto3_sched1_not_pqrcpend ..................... Passed 185.54 sec Start 1674: shm_example_simple_lap_z_facto1_sched4_kway_tqrcpend 1163/3626 Test #1253: shm_example_simple_lap_z_facto4_sched1_kwayprojections_rqrcpbegin ....... Passed 161.02 sec Start 1675: shm_example_simple_lap_z_facto1_sched4_kwayprojections_tqrcpbegin 1164/3626 Test #1276: shm_example_simple_lap_s_facto0_sched4_not_pqrcpend ..................... Passed 146.38 sec Start 1676: shm_example_simple_lap_z_facto1_sched4_kwayprojections_tqrcpend 1165/3626 Test #1227: shm_example_simple_lap_z_facto3_sched1_kwayprojections_tqrcpbegin ....... Passed 173.84 sec Start 1677: shm_example_simple_lap_z_facto1_sched4_not_rqrrtbegin 1166/3626 Test #1258: shm_example_simple_lap_z_facto4_sched1_kway_tqrcpend .................... Passed 156.67 sec Start 1678: shm_example_simple_lap_z_facto1_sched4_not_rqrrtend 1167/3626 Test #1255: shm_example_simple_lap_z_facto4_sched1_not_tqrcpbegin ................... Passed 161.18 sec Start 1679: shm_example_simple_lap_z_facto1_sched4_kway_rqrrtbegin 1168/3626 Test #1271: shm_example_simple_lap_s_facto0_sched4_kway_svdbegin .................... Passed 149.91 sec Start 1680: shm_example_simple_lap_z_facto1_sched4_kway_rqrrtend 1169/3626 Test #1292: shm_example_simple_lap_s_facto0_sched4_kwayprojections_tqrcpend ......... Passed 141.37 sec Start 1681: shm_example_simple_lap_z_facto1_sched4_kwayprojections_rqrrtbegin 1170/3626 Test #1268: shm_example_simple_lap_z_facto4_sched1_kway_pqrcpilu1 ................... 
Passed 150.56 sec Start 1682: shm_example_simple_lap_z_facto1_sched4_kwayprojections_rqrrtend 1171/3626 Test #1288: shm_example_simple_lap_s_facto0_sched4_not_tqrcpend ..................... Passed 142.88 sec Start 1683: shm_example_simple_lap_z_facto1_sched4_kway_pqrcpilu0 1172/3626 Test #1281: shm_example_simple_lap_s_facto0_sched4_not_rqrcpbegin ................... Passed 145.24 sec Start 1684: shm_example_simple_lap_z_facto1_sched4_kway_pqrcpilu1 1173/3626 Test #1265: shm_example_simple_lap_z_facto4_sched1_kwayprojections_rqrrtbegin ....... Passed 152.05 sec Start 1685: shm_example_simple_lap_z_facto2_sched4_not_svdbegin 1174/3626 Test #1231: shm_example_simple_lap_z_facto3_sched1_kway_rqrrtbegin .................. Passed 170.04 sec Start 1686: shm_example_simple_lap_z_facto2_sched4_not_svdend 1175/3626 Test #1272: shm_example_simple_lap_s_facto0_sched4_kway_svdend ...................... Passed 150.56 sec Start 1687: shm_example_simple_lap_z_facto2_sched4_kway_svdbegin 1176/3626 Test #1270: shm_example_simple_lap_s_facto0_sched4_not_svdend ....................... Passed 151.30 sec Start 1688: shm_example_simple_lap_z_facto2_sched4_kway_svdend 1177/3626 Test #1284: shm_example_simple_lap_s_facto0_sched4_kway_rqrcpend .................... Passed 145.71 sec Start 1689: shm_example_simple_lap_z_facto2_sched4_kwayprojections_svdbegin 1178/3626 Test #1239: shm_example_simple_lap_z_facto4_sched1_kway_svdbegin .................... Passed 168.81 sec Start 1690: shm_example_simple_lap_z_facto2_sched4_kwayprojections_svdend Test #687: shm_example_simple_lap_z_facto2_sched0_not_rqrrtend ..................... Passed 155.19 sec Start 1691: shm_example_simple_lap_z_facto2_sched4_not_pqrcpbegin 1180/3626 Test #1264: shm_example_simple_lap_z_facto4_sched1_kway_rqrrtend .................... Passed 153.33 sec Start 1692: shm_example_simple_lap_z_facto2_sched4_not_pqrcpend Test #973: shm_example_simple_lap_c_facto0_sched1_not_rqrrtbegin ................... 
Passed 180.40 sec Start 1693: shm_example_simple_lap_z_facto2_sched4_kway_pqrcpbegin 1182/3626 Test #1244: shm_example_simple_lap_z_facto4_sched1_not_pqrcpend ..................... Passed 167.71 sec Start 1694: shm_example_simple_lap_z_facto2_sched4_kway_pqrcpend Test #786: shm_example_simple_lap_s_facto0_sched1_kwayprojections_rqrrtbegin ....... Passed 138.89 sec Start 1695: shm_example_simple_lap_z_facto2_sched4_kwayprojections_pqrcpbegin 1184/3626 Test #1287: shm_example_simple_lap_s_facto0_sched4_not_tqrcpbegin ................... Passed 146.61 sec Start 1696: shm_example_simple_lap_z_facto2_sched4_kwayprojections_pqrcpend 1185/3626 Test #1205: shm_example_simple_lap_z_facto3_sched1_not_svdbegin ..................... Passed 192.11 sec Start 1697: shm_example_simple_lap_z_facto2_sched4_not_rqrcpbegin 1186/3626 Test #1203: shm_example_simple_lap_z_facto2_sched1_kway_pqrcpilu0 ...................***Timeout 200.03 sec Start 1203: shm_example_simple_lap_z_facto2_sched1_kway_pqrcpilu0 Test #950: shm_example_simple_lap_c_facto0_sched1_not_svdend .......................***Timeout 200.02 sec Start 950: shm_example_simple_lap_c_facto0_sched1_not_svdend 1186/3626 Test #1256: shm_example_simple_lap_z_facto4_sched1_not_tqrcpend ..................... Passed 169.62 sec Start 1698: shm_example_simple_lap_z_facto2_sched4_not_rqrcpend Test #777: shm_example_simple_lap_s_facto0_sched1_not_tqrcpend ..................... 
Passed 147.99 sec Start 1699: shm_example_simple_lap_z_facto2_sched4_kway_rqrcpbegin 1188/3626 Test #1206: shm_example_simple_lap_z_facto3_sched1_not_svdend .......................***Timeout 200.26 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 1206: shm_example_simple_lap_z_facto3_sched1_not_svdend 1188/3626 Test #1291: shm_example_simple_lap_s_facto0_sched4_kwayprojections_tqrcpbegin ....... Passed 153.53 sec Start 1700: shm_example_simple_lap_z_facto2_sched4_kway_rqrcpend Test #767: shm_example_simple_lap_s_facto0_sched1_kway_pqrcpend .................... Passed 148.66 sec Start 1701: shm_example_simple_lap_z_facto2_sched4_kwayprojections_rqrcpbegin Test #784: shm_example_simple_lap_s_facto0_sched1_kway_rqrrtbegin .................. Passed 148.26 sec Start 1702: shm_example_simple_lap_z_facto2_sched4_kwayprojections_rqrcpend Test #1026: shm_example_simple_lap_c_facto2_sched1_not_rqrcpend ..................... 
Passed 165.01 sec Start 1703: shm_example_simple_lap_z_facto2_sched4_not_tqrcpbegin 1192/3626 Test #1267: shm_example_simple_lap_z_facto4_sched1_kway_pqrcpilu0 ................... Passed 163.43 sec Start 1704: shm_example_simple_lap_z_facto2_sched4_not_tqrcpend 1193/3626 Test #1299: shm_example_simple_lap_s_facto0_sched4_kway_pqrcpilu0 ................... Passed 145.64 sec Start 1705: shm_example_simple_lap_z_facto2_sched4_kway_tqrcpbegin 1194/3626 Test #1211: shm_example_simple_lap_z_facto3_sched1_not_pqrcpbegin ................... Passed 199.77 sec Start 1706: shm_example_simple_lap_z_facto2_sched4_kway_tqrcpend 1195/3626 Test #1300: shm_example_simple_lap_s_facto0_sched4_kway_pqrcpilu1 ................... Passed 146.23 sec Start 1707: shm_example_simple_lap_z_facto2_sched4_kwayprojections_tqrcpbegin 1196/3626 Test #1277: shm_example_simple_lap_s_facto0_sched4_kway_pqrcpbegin .................. Passed 159.98 sec Start 1708: shm_example_simple_lap_z_facto2_sched4_kwayprojections_tqrcpend Test #982: shm_example_simple_lap_c_facto1_sched1_not_svdend ....................... Passed 180.47 sec Start 1709: shm_example_simple_lap_z_facto2_sched4_not_rqrrtbegin 1198/3626 Test #1298: shm_example_simple_lap_s_facto0_sched4_kwayprojections_rqrrtend ......... Passed 147.05 sec Start 1710: shm_example_simple_lap_z_facto2_sched4_not_rqrrtend Test #794: shm_example_simple_lap_s_facto1_sched1_kwayprojections_svdbegin ......... Passed 151.44 sec Start 1711: shm_example_simple_lap_z_facto2_sched4_kway_rqrrtbegin 1200/3626 Test #1260: shm_example_simple_lap_z_facto4_sched1_kwayprojections_tqrcpend ......... 
Passed 172.72 sec Start 1712: shm_example_simple_lap_z_facto2_sched4_kway_rqrrtend 1201/3626 Test #1223: shm_example_simple_lap_z_facto3_sched1_not_tqrcpbegin ...................***Timeout 200.11 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 1223: shm_example_simple_lap_z_facto3_sched1_not_tqrcpbegin Test #762: shm_example_simple_lap_s_facto0_sched1_kwayprojections_svdbegin ......... Passed 159.33 sec Start 1713: shm_example_simple_lap_z_facto2_sched4_kwayprojections_rqrrtbegin 1202/3626 Test #1250: shm_example_simple_lap_z_facto4_sched1_not_rqrcpend ..................... Passed 185.60 sec Start 1714: shm_example_simple_lap_z_facto2_sched4_kwayprojections_rqrrtend 1203/3626 Test #1286: shm_example_simple_lap_s_facto0_sched4_kwayprojections_rqrcpend ......... 
Passed 167.91 sec Start 1715: shm_example_simple_lap_z_facto2_sched4_kway_pqrcpilu0 1204/3626 Test #1225: shm_example_simple_lap_z_facto3_sched1_kway_tqrcpbegin ..................***Timeout 200.17 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 1225: shm_example_simple_lap_z_facto3_sched1_kway_tqrcpbegin Test #750: shm_example_simple_lap_z_facto4_sched0_not_rqrrtbegin ................... Passed 161.89 sec Start 1716: shm_example_simple_lap_z_facto2_sched4_kway_pqrcpilu1 Test #711: shm_example_simple_lap_z_facto3_sched0_kwayprojections_rqrcpend ......... Passed 179.47 sec Start 1717: shm_example_simple_lap_z_facto3_sched4_not_svdbegin Test #744: shm_example_simple_lap_z_facto4_sched0_not_tqrcpbegin ................... 
Passed 160.98 sec Start 1718: shm_example_simple_lap_z_facto3_sched4_not_svdend 1207/3626 Test #1226: shm_example_simple_lap_z_facto3_sched1_kway_tqrcpend ....................***Timeout 200.10 sec Start 1226: shm_example_simple_lap_z_facto3_sched1_kway_tqrcpend Test #752: shm_example_simple_lap_z_facto4_sched0_kway_rqrrtbegin .................. Passed 161.09 sec Start 1719: shm_example_simple_lap_z_facto3_sched4_kway_svdbegin Test #800: shm_example_simple_lap_s_facto1_sched1_kwayprojections_pqrcpbegin ....... Passed 161.92 sec Start 1720: shm_example_simple_lap_z_facto3_sched4_kway_svdend 1209/3626 Test #1301: shm_example_simple_lap_s_facto1_sched4_not_svdbegin ..................... Passed 159.13 sec 1210/3626 Test #1315: shm_example_simple_lap_s_facto1_sched4_kway_rqrcpbegin .................. Passed 157.66 sec Start 1721: shm_example_simple_lap_z_facto3_sched4_kwayprojections_svdbegin Start 1722: shm_example_simple_lap_z_facto3_sched4_kwayprojections_svdend 1211/3626 Test #1296: shm_example_simple_lap_s_facto0_sched4_kway_rqrrtend .................... Passed 159.56 sec Start 1723: shm_example_simple_lap_z_facto3_sched4_not_pqrcpbegin Test #764: shm_example_simple_lap_s_facto0_sched1_not_pqrcpbegin ................... Passed 163.02 sec Start 1724: shm_example_simple_lap_z_facto3_sched4_not_pqrcpend 1213/3626 Test #1293: shm_example_simple_lap_s_facto0_sched4_not_rqrrtbegin ................... Passed 160.19 sec Start 1725: shm_example_simple_lap_z_facto3_sched4_kway_pqrcpbegin Test #782: shm_example_simple_lap_s_facto0_sched1_not_rqrrtbegin ................... Passed 162.69 sec Start 1726: shm_example_simple_lap_z_facto3_sched4_kway_pqrcpend 1215/3626 Test #1263: shm_example_simple_lap_z_facto4_sched1_kway_rqrrtbegin .................. Passed 179.29 sec Start 1727: shm_example_simple_lap_z_facto3_sched4_kwayprojections_pqrcpbegin Test #802: shm_example_simple_lap_s_facto1_sched1_not_rqrcpbegin ................... 
Passed 161.58 sec Start 1728: shm_example_simple_lap_z_facto3_sched4_kwayprojections_pqrcpend Test #774: shm_example_simple_lap_s_facto0_sched1_kwayprojections_rqrcpbegin ....... Passed 161.62 sec Start 1729: shm_example_simple_lap_z_facto3_sched4_not_rqrcpbegin 1218/3626 Test #1237: shm_example_simple_lap_z_facto4_sched1_not_svdbegin ..................... Passed 196.27 sec Start 1730: shm_example_simple_lap_z_facto3_sched4_not_rqrcpend 1219/3626 Test #1343: shm_example_simple_lap_s_facto2_sched4_kwayprojections_pqrcpbegin ....... Passed 156.57 sec Start 1731: shm_example_simple_lap_z_facto3_sched4_kway_rqrcpbegin 1220/3626 Test #1232: shm_example_simple_lap_z_facto3_sched1_kway_rqrrtend ....................***Timeout 200.03 sec Start 1232: shm_example_simple_lap_z_facto3_sched1_kway_rqrrtend 1220/3626 Test #1347: shm_example_simple_lap_s_facto2_sched4_kway_rqrcpbegin .................. Passed 156.75 sec Start 1732: shm_example_simple_lap_z_facto3_sched4_kway_rqrcpend 1221/3626 Test #1235: shm_example_simple_lap_z_facto3_sched1_kway_pqrcpilu0 ...................***Timeout 200.07 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of 
projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 1235: shm_example_simple_lap_z_facto3_sched1_kway_pqrcpilu0 1221/3626 Test #1251: shm_example_simple_lap_z_facto4_sched1_kway_rqrcpbegin .................. Passed 193.34 sec Start 1733: shm_example_simple_lap_z_facto3_sched4_kwayprojections_rqrcpbegin 1222/3626 Test #1257: shm_example_simple_lap_z_facto4_sched1_kway_tqrcpbegin .................. Passed 191.20 sec Start 1734: shm_example_simple_lap_z_facto3_sched4_kwayprojections_rqrcpend 1223/3626 Test #1240: shm_example_simple_lap_z_facto4_sched1_kway_svdend ......................***Timeout 200.30 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 1240: shm_example_simple_lap_z_facto4_sched1_kway_svdend 1223/3626 Test #1242: shm_example_simple_lap_z_facto4_sched1_kwayprojections_svdend ...........***Timeout 200.32 sec 
ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 1242: shm_example_simple_lap_z_facto4_sched1_kwayprojections_svdend 1223/3626 Test #1393: shm_example_simple_lap_d_facto0_sched4_kwayprojections_rqrrtbegin ....... Passed 155.01 sec Start 1735: shm_example_simple_lap_z_facto3_sched4_not_tqrcpbegin Test #790: shm_example_simple_lap_s_facto1_sched1_not_svdbegin ..................... Passed 168.13 sec Start 1736: shm_example_simple_lap_z_facto3_sched4_not_tqrcpend 1225/3626 Test #1314: shm_example_simple_lap_s_facto1_sched4_not_rqrcpend ..................... Passed 165.73 sec Start 1737: shm_example_simple_lap_z_facto3_sched4_kway_tqrcpbegin 1226/3626 Test #1273: shm_example_simple_lap_s_facto0_sched4_kwayprojections_svdbegin ......... Passed 183.00 sec Start 1738: shm_example_simple_lap_z_facto3_sched4_kway_tqrcpend 1227/3626 Test #1386: shm_example_simple_lap_d_facto0_sched4_kway_tqrcpend .................... 
Passed 156.93 sec Start 1739: shm_example_simple_lap_z_facto3_sched4_kwayprojections_tqrcpbegin 1228/3626 Test #1318: shm_example_simple_lap_s_facto1_sched4_kwayprojections_rqrcpend ......... Passed 166.31 sec Start 1740: shm_example_simple_lap_z_facto3_sched4_kwayprojections_tqrcpend 1229/3626 Test #1282: shm_example_simple_lap_s_facto0_sched4_not_rqrcpend ..................... Passed 179.89 sec Start 1741: shm_example_simple_lap_z_facto3_sched4_not_rqrrtbegin 1230/3626 Test #1243: shm_example_simple_lap_z_facto4_sched1_not_pqrcpbegin ...................***Timeout 200.23 sec Start 1243: shm_example_simple_lap_z_facto4_sched1_not_pqrcpbegin 1230/3626 Test #1294: shm_example_simple_lap_s_facto0_sched4_not_rqrrtend ..................... Passed 168.86 sec Start 1742: shm_example_simple_lap_z_facto3_sched4_not_rqrrtend 1231/3626 Test #1245: shm_example_simple_lap_z_facto4_sched1_kway_pqrcpbegin ..................***Timeout 200.17 sec Start 1245: shm_example_simple_lap_z_facto4_sched1_kway_pqrcpbegin 1231/3626 Test #1368: shm_example_simple_lap_d_facto0_sched4_kway_svdend ...................... Passed 159.74 sec 1232/3626 Test #1246: shm_example_simple_lap_z_facto4_sched1_kway_pqrcpend ....................***Timeout 200.19 sec Start 1246: shm_example_simple_lap_z_facto4_sched1_kway_pqrcpend Start 1743: shm_example_simple_lap_z_facto3_sched4_kway_rqrrtbegin 1232/3626 Test #1369: shm_example_simple_lap_d_facto0_sched4_kwayprojections_svdbegin ......... Passed 159.76 sec Start 1744: shm_example_simple_lap_z_facto3_sched4_kway_rqrrtend Test #738: shm_example_simple_lap_z_facto4_sched0_not_rqrcpbegin ................... 
Passed 173.23 sec Start 1745: shm_example_simple_lap_z_facto3_sched4_kwayprojections_rqrrtbegin 1234/3626 Test #1248: shm_example_simple_lap_z_facto4_sched1_kwayprojections_pqrcpend .........***Timeout 200.08 sec Start 1248: shm_example_simple_lap_z_facto4_sched1_kwayprojections_pqrcpend 1234/3626 Test #1319: shm_example_simple_lap_s_facto1_sched4_not_tqrcpbegin ................... Passed 167.29 sec Start 1746: shm_example_simple_lap_z_facto3_sched4_kwayprojections_rqrrtend 1235/3626 Test #1355: shm_example_simple_lap_s_facto2_sched4_kwayprojections_tqrcpbegin ....... Passed 161.74 sec Start 1747: shm_example_simple_lap_z_facto3_sched4_kway_pqrcpilu0 1236/3626 Test #1388: shm_example_simple_lap_d_facto0_sched4_kwayprojections_tqrcpend ......... Passed 158.15 sec Start 1748: shm_example_simple_lap_z_facto3_sched4_kway_pqrcpilu1 Test #801: shm_example_simple_lap_s_facto1_sched1_kwayprojections_pqrcpend ......... Passed 171.23 sec Start 1749: shm_example_simple_lap_z_facto4_sched4_not_svdbegin 1238/3626 Test #1345: shm_example_simple_lap_s_facto2_sched4_not_rqrcpbegin ................... Passed 163.01 sec Start 1750: shm_example_simple_lap_z_facto4_sched4_not_svdend 1239/3626 Test #1359: shm_example_simple_lap_s_facto2_sched4_kway_rqrrtbegin .................. Passed 161.74 sec Start 1751: shm_example_simple_lap_z_facto4_sched4_kway_svdbegin 1240/3626 Test #1342: shm_example_simple_lap_s_facto2_sched4_kway_pqrcpend .................... Passed 163.34 sec Start 1752: shm_example_simple_lap_z_facto4_sched4_kway_svdend 1241/3626 Test #1305: shm_example_simple_lap_s_facto1_sched4_kwayprojections_svdbegin ......... Passed 169.24 sec Start 1753: shm_example_simple_lap_z_facto4_sched4_kwayprojections_svdbegin 1242/3626 Test #1338: shm_example_simple_lap_s_facto2_sched4_kwayprojections_svdend ........... 
Passed 164.60 sec Start 1754: shm_example_simple_lap_z_facto4_sched4_kwayprojections_svdend 1243/3626 Test #1350: shm_example_simple_lap_s_facto2_sched4_kwayprojections_rqrcpend ......... Passed 163.65 sec Start 1755: shm_example_simple_lap_z_facto4_sched4_not_pqrcpbegin Test #1023: shm_example_simple_lap_c_facto2_sched1_kwayprojections_pqrcpbegin ....... Passed 190.72 sec Start 1756: shm_example_simple_lap_z_facto4_sched4_not_pqrcpend 1245/3626 Test #1354: shm_example_simple_lap_s_facto2_sched4_kway_tqrcpend .................... Passed 163.88 sec Start 1757: shm_example_simple_lap_z_facto4_sched4_kway_pqrcpbegin Test #798: shm_example_simple_lap_s_facto1_sched1_kway_pqrcpbegin .................. Passed 174.57 sec Start 1758: shm_example_simple_lap_z_facto4_sched4_kway_pqrcpend Test #792: shm_example_simple_lap_s_facto1_sched1_kway_svdbegin .................... Passed 173.31 sec Start 1759: shm_example_simple_lap_z_facto4_sched4_kwayprojections_pqrcpbegin 1248/3626 Test #1312: shm_example_simple_lap_s_facto1_sched4_kwayprojections_pqrcpend ......... Passed 171.13 sec Start 1760: shm_example_simple_lap_z_facto4_sched4_kwayprojections_pqrcpend Test #770: shm_example_simple_lap_s_facto0_sched1_not_rqrcpbegin ................... Passed 174.15 sec Start 1761: shm_example_simple_lap_z_facto4_sched4_not_rqrcpbegin Test #781: shm_example_simple_lap_s_facto0_sched1_kwayprojections_tqrcpend ......... Passed 174.97 sec Start 1762: shm_example_simple_lap_z_facto4_sched4_not_rqrcpend 1251/3626 Test #1303: shm_example_simple_lap_s_facto1_sched4_kway_svdbegin .................... Passed 174.73 sec Start 1763: shm_example_simple_lap_z_facto4_sched4_kway_rqrcpbegin 1252/3626 Test #1302: shm_example_simple_lap_s_facto1_sched4_not_svdend ....................... Passed 175.62 sec Start 1764: shm_example_simple_lap_z_facto4_sched4_kway_rqrcpend 1253/3626 Test #1352: shm_example_simple_lap_s_facto2_sched4_not_tqrcpend ..................... 
Passed 168.45 sec Start 1765: shm_example_simple_lap_z_facto4_sched4_kwayprojections_rqrcpbegin 1254/3626 Test #1337: shm_example_simple_lap_s_facto2_sched4_kwayprojections_svdbegin ......... Passed 169.88 sec Start 1766: shm_example_simple_lap_z_facto4_sched4_kwayprojections_rqrcpend 1255/3626 Test #1333: shm_example_simple_lap_s_facto2_sched4_not_svdbegin ..................... Passed 173.44 sec Start 1767: shm_example_simple_lap_z_facto4_sched4_not_tqrcpbegin 1256/3626 Test #1348: shm_example_simple_lap_s_facto2_sched4_kway_rqrcpend .................... Passed 169.31 sec Start 1768: shm_example_simple_lap_z_facto4_sched4_not_tqrcpend Test #826: shm_example_simple_lap_s_facto2_sched1_kwayprojections_svdbegin ......... Passed 156.09 sec Start 1769: shm_example_simple_lap_z_facto4_sched4_kway_tqrcpbegin 1258/3626 Test #1399: shm_example_simple_lap_d_facto1_sched4_kway_svdbegin .................... Passed 163.55 sec Start 1770: shm_example_simple_lap_z_facto4_sched4_kway_tqrcpend Test #780: shm_example_simple_lap_s_facto0_sched1_kwayprojections_tqrcpbegin ....... Passed 180.82 sec Start 1771: shm_example_simple_lap_z_facto4_sched4_kwayprojections_tqrcpbegin 1260/3626 Test #1344: shm_example_simple_lap_s_facto2_sched4_kwayprojections_pqrcpend ......... Passed 173.12 sec Start 1772: shm_example_simple_lap_z_facto4_sched4_kwayprojections_tqrcpend 1261/3626 Test #1324: shm_example_simple_lap_s_facto1_sched4_kwayprojections_tqrcpend ......... Passed 177.61 sec 1262/3626 Test #1364: shm_example_simple_lap_s_facto2_sched4_kway_pqrcpilu1 ................... Passed 171.19 sec Start 1773: shm_example_simple_lap_z_facto4_sched4_not_rqrrtbegin Start 1774: shm_example_simple_lap_z_facto4_sched4_not_rqrrtend 1263/3626 Test #1274: shm_example_simple_lap_s_facto0_sched4_kwayprojections_svdend ........... Passed 195.40 sec Start 1775: shm_example_simple_lap_z_facto4_sched4_kway_rqrrtbegin Test #776: shm_example_simple_lap_s_facto0_sched1_not_tqrcpbegin ................... 
Passed 181.91 sec Start 1776: shm_example_simple_lap_z_facto4_sched4_kway_rqrrtend 1265/3626 Test #1410: shm_example_simple_lap_d_facto1_sched4_not_rqrcpend ..................... Passed 165.16 sec Start 1777: shm_example_simple_lap_z_facto4_sched4_kwayprojections_rqrrtbegin Test #769: shm_example_simple_lap_s_facto0_sched1_kwayprojections_pqrcpend ......... Passed 184.08 sec Start 1778: shm_example_simple_lap_z_facto4_sched4_kwayprojections_rqrrtend 1267/3626 Test #1321: shm_example_simple_lap_s_facto1_sched4_kway_tqrcpbegin .................. Passed 178.76 sec 1268/3626 Test #1452: shm_example_simple_lap_d_facto2_sched4_kwayprojections_tqrcpend ......... Passed 158.29 sec Start 1779: shm_example_simple_lap_z_facto4_sched4_kway_pqrcpilu0 Start 1780: shm_example_simple_lap_z_facto4_sched4_kway_pqrcpilu1 Test #743: shm_example_simple_lap_z_facto4_sched0_kwayprojections_rqrcpend ......... Passed 185.34 sec Start 1781: c_mpi_rep_example_analyze_lap_s_facto0 1270/3626 Test #1387: shm_example_simple_lap_d_facto0_sched4_kwayprojections_tqrcpbegin ....... Passed 170.70 sec Start 1782: c_mpi_rep_example_analyze_lap_s_facto1 1271/3626 Test #1371: shm_example_simple_lap_d_facto0_sched4_not_pqrcpbegin ................... Passed 172.72 sec Start 1783: c_mpi_rep_example_analyze_lap_s_facto2 1272/3626 Test #1353: shm_example_simple_lap_s_facto2_sched4_kway_tqrcpbegin .................. Passed 174.58 sec Start 1784: c_mpi_rep_example_analyze_lap_d_facto0 1273/3626 Test #1379: shm_example_simple_lap_d_facto0_sched4_kway_rqrcpbegin .................. Passed 174.47 sec Start 1785: c_mpi_rep_example_analyze_lap_d_facto1 1274/3626 Test #1375: shm_example_simple_lap_d_facto0_sched4_kwayprojections_pqrcpbegin ....... Passed 175.30 sec Start 1786: c_mpi_rep_example_analyze_lap_d_facto2 1275/3626 Test #1403: shm_example_simple_lap_d_facto1_sched4_not_pqrcpbegin ................... 
Passed 171.15 sec Start 1787: c_mpi_rep_example_analyze_lap_c_facto0 1276/3626 Test #1279: shm_example_simple_lap_s_facto0_sched4_kwayprojections_pqrcpbegin .......***Timeout 200.45 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 1279: shm_example_simple_lap_s_facto0_sched4_kwayprojections_pqrcpbegin 1276/3626 Test #1389: shm_example_simple_lap_d_facto0_sched4_not_rqrrtbegin ................... Passed 177.28 sec Start 1788: c_mpi_rep_example_analyze_lap_c_facto1 1277/3626 Test #1283: shm_example_simple_lap_s_facto0_sched4_kway_rqrcpbegin ..................***Timeout 200.38 sec Start 1283: shm_example_simple_lap_s_facto0_sched4_kway_rqrcpbegin 1277/3626 Test #1304: shm_example_simple_lap_s_facto1_sched4_kway_svdend ...................... Passed 188.19 sec Start 1789: c_mpi_rep_example_analyze_lap_c_facto2 1278/3626 Test #1382: shm_example_simple_lap_d_facto0_sched4_kwayprojections_rqrcpend ......... 
Passed 178.66 sec Start 1790: c_mpi_rep_example_analyze_lap_c_facto3 1279/3626 Test #1404: shm_example_simple_lap_d_facto1_sched4_not_pqrcpend ..................... Passed 174.44 sec 1280/3626 Test #1406: shm_example_simple_lap_d_facto1_sched4_kway_pqrcpend .................... Passed 174.12 sec Start 1791: c_mpi_rep_example_analyze_lap_c_facto4 Start 1792: c_mpi_rep_example_analyze_lap_z_facto0 1281/3626 Test #1390: shm_example_simple_lap_d_facto0_sched4_not_rqrrtend ..................... Passed 178.05 sec Start 1793: c_mpi_rep_example_analyze_lap_z_facto1 1282/3626 Test #1308: shm_example_simple_lap_s_facto1_sched4_not_pqrcpend ..................... Passed 188.60 sec Start 1794: c_mpi_rep_example_analyze_lap_z_facto2 Test #768: shm_example_simple_lap_s_facto0_sched1_kwayprojections_pqrcpbegin ....... Passed 191.72 sec Start 1795: c_mpi_rep_example_analyze_lap_z_facto3 1284/3626 Test #1290: shm_example_simple_lap_s_facto0_sched4_kway_tqrcpend .................... Passed 199.06 sec Start 1796: c_mpi_rep_example_analyze_lap_z_facto4 1285/3626 Test #1411: shm_example_simple_lap_d_facto1_sched4_kway_rqrcpbegin .................. Passed 168.52 sec Start 1797: c_mpi_rep_example_simple_lap_s_facto0 1286/3626 Test #1295: shm_example_simple_lap_s_facto0_sched4_kway_rqrrtbegin .................. Passed 190.84 sec Start 1798: c_mpi_rep_example_simple_lap_s_facto1 Test #773: shm_example_simple_lap_s_facto0_sched1_kway_rqrcpend .................... Passed 193.87 sec Start 1799: c_mpi_rep_example_simple_lap_s_facto2 1288/3626 Test #1289: shm_example_simple_lap_s_facto0_sched4_kway_tqrcpbegin ..................***Timeout 200.24 sec Start 1289: shm_example_simple_lap_s_facto0_sched4_kway_tqrcpbegin Test #785: shm_example_simple_lap_s_facto0_sched1_kway_rqrrtend .................... Passed 194.09 sec 1289/3626 Test #1297: shm_example_simple_lap_s_facto0_sched4_kwayprojections_rqrrtbegin ....... 
Passed 190.92 sec Test #1040: shm_example_simple_lap_c_facto2_sched1_kway_rqrrtend .................... Passed 169.44 sec 1291/3626 Test #1486: shm_example_simple_lap_c_facto0_sched4_not_rqrrtend ..................... Passed 168.02 sec Start 1800: c_mpi_rep_example_simple_lap_d_facto0 Start 1801: c_mpi_rep_example_simple_lap_d_facto1 Start 1802: c_mpi_rep_example_simple_lap_d_facto2 Start 1803: c_mpi_rep_example_simple_lap_c_facto0 1292/3626 Test #1402: shm_example_simple_lap_d_facto1_sched4_kwayprojections_svdend ........... Passed 177.05 sec 1293/3626 Test #1508: shm_example_simple_lap_c_facto1_sched4_kway_rqrcpend .................... Passed 167.52 sec Start 1804: c_mpi_rep_example_simple_lap_c_facto1 Start 1805: c_mpi_rep_example_simple_lap_c_facto2 1294/3626 Test #1412: shm_example_simple_lap_d_facto1_sched4_kway_rqrcpend .................... Passed 169.71 sec Start 1806: c_mpi_rep_example_simple_lap_c_facto3 1295/3626 Test #1422: shm_example_simple_lap_d_facto1_sched4_not_rqrrtend ..................... Passed 169.69 sec Start 1807: c_mpi_rep_example_simple_lap_c_facto4 1296/3626 Test #1306: shm_example_simple_lap_s_facto1_sched4_kwayprojections_svdend ........... Passed 190.52 sec Start 1808: c_mpi_rep_example_simple_lap_z_facto0 1297/3626 Test #1462: shm_example_simple_lap_c_facto0_sched4_not_svdend ....................... Passed 169.10 sec Start 1809: c_mpi_rep_example_simple_lap_z_facto1 1298/3626 Test #1408: shm_example_simple_lap_d_facto1_sched4_kwayprojections_pqrcpend ......... Passed 177.15 sec Start 1810: c_mpi_rep_example_simple_lap_z_facto2 Test #748: shm_example_simple_lap_z_facto4_sched0_kwayprojections_tqrcpbegin ....... Passed 196.84 sec Start 1811: c_mpi_rep_example_simple_lap_z_facto3 1300/3626 Test #1454: shm_example_simple_lap_d_facto2_sched4_not_rqrrtend ..................... Passed 171.50 sec Start 1812: c_mpi_rep_example_simple_lap_z_facto4 1301/3626 Test #1519: shm_example_simple_lap_c_facto1_sched4_kway_rqrrtbegin .................. 
Passed 167.86 sec Start 1813: c_mpi_rep_example_simple_solve_and_refine_lap_s_facto0 1302/3626 Test #1507: shm_example_simple_lap_c_facto1_sched4_kway_rqrcpbegin .................. Passed 170.64 sec Start 1814: c_mpi_rep_example_simple_solve_and_refine_lap_s_facto1 1303/3626 Test #1351: shm_example_simple_lap_s_facto2_sched4_not_tqrcpbegin ................... Passed 187.08 sec Start 1815: c_mpi_rep_example_simple_solve_and_refine_lap_s_facto2 Test #783: shm_example_simple_lap_s_facto0_sched1_not_rqrrtend ..................... Passed 197.58 sec 1305/3626 Test #1356: shm_example_simple_lap_s_facto2_sched4_kwayprojections_tqrcpend ......... Passed 186.67 sec 1306/3626 Test #1383: shm_example_simple_lap_d_facto0_sched4_not_tqrcpbegin ................... Passed 184.05 sec 1307/3626 Test #1499: shm_example_simple_lap_c_facto1_sched4_not_pqrcpbegin ................... Passed 170.95 sec Start 1816: c_mpi_rep_example_simple_solve_and_refine_lap_d_facto0 Start 1817: c_mpi_rep_example_simple_solve_and_refine_lap_d_facto1 Start 1818: c_mpi_rep_example_simple_solve_and_refine_lap_d_facto2 Start 1819: c_mpi_rep_example_simple_solve_and_refine_lap_c_facto0 1308/3626 Test #1322: shm_example_simple_lap_s_facto1_sched4_kway_tqrcpend .................... Passed 192.47 sec Start 1820: c_mpi_rep_example_simple_solve_and_refine_lap_c_facto1 1309/3626 Test #1445: shm_example_simple_lap_d_facto2_sched4_kwayprojections_rqrcpbegin ....... Passed 172.56 sec 1310/3626 Test #1466: shm_example_simple_lap_c_facto0_sched4_kwayprojections_svdend ........... Passed 171.88 sec Start 1821: c_mpi_rep_example_simple_solve_and_refine_lap_c_facto2 Start 1822: c_mpi_rep_example_simple_solve_and_refine_lap_c_facto3 1311/3626 Test #1357: shm_example_simple_lap_s_facto2_sched4_not_rqrrtbegin ................... Passed 187.15 sec Start 1823: c_mpi_rep_example_simple_solve_and_refine_lap_c_facto4 1312/3626 Test #1509: shm_example_simple_lap_c_facto1_sched4_kwayprojections_rqrcpbegin ....... 
Passed 171.42 sec Start 1824: c_mpi_rep_example_simple_solve_and_refine_lap_z_facto0 1313/3626 Test #1494: shm_example_simple_lap_c_facto1_sched4_not_svdend ....................... Passed 172.03 sec Start 1825: c_mpi_rep_example_simple_solve_and_refine_lap_z_facto1 Test #739: shm_example_simple_lap_z_facto4_sched0_not_rqrcpend .....................***Timeout 199.96 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Test #742: shm_example_simple_lap_z_facto4_sched0_kwayprojections_rqrcpbegin .......***Timeout 199.94 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: 
PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Test #791: shm_example_simple_lap_s_facto1_sched1_not_svdend .......................***Timeout 198.86 sec Test #793: shm_example_simple_lap_s_facto1_sched1_kway_svdend ......................***Timeout 198.83 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Test #795: shm_example_simple_lap_s_facto1_sched1_kwayprojections_svdend 
...........***Timeout 198.80 sec Test #796: shm_example_simple_lap_s_facto1_sched1_not_pqrcpbegin ...................***Timeout 198.77 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Test #766: shm_example_simple_lap_s_facto0_sched1_kway_pqrcpbegin ..................***Timeout 198.01 sec 1321/3626 Test #1467: shm_example_simple_lap_c_facto0_sched4_not_pqrcpbegin ................... Passed 173.26 sec 1322/3626 Test #1521: shm_example_simple_lap_c_facto1_sched4_kwayprojections_rqrrtbegin ....... 
Passed 169.20 sec Start 1826: c_mpi_rep_example_simple_solve_and_refine_lap_z_facto2 Start 1827: c_mpi_rep_example_simple_solve_and_refine_lap_z_facto3 Start 1828: c_mpi_rep_example_simple_solve_and_refine_lap_z_facto4 Start 1829: c_mpi_rep_example_simple_trans_lap_s_facto0 Start 1830: c_mpi_rep_example_simple_trans_lap_s_facto1 Start 1831: c_mpi_rep_example_simple_trans_lap_s_facto2 Start 1832: c_mpi_rep_example_simple_trans_lap_d_facto0 Start 1833: c_mpi_rep_example_simple_trans_lap_d_facto1 Start 1834: c_mpi_rep_example_simple_trans_lap_d_facto2 Test #787: shm_example_simple_lap_s_facto0_sched1_kwayprojections_rqrrtend .........***Timeout 197.14 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 1835: c_mpi_rep_example_simple_trans_lap_c_facto0 1324/3626 Test #1407: shm_example_simple_lap_d_facto1_sched4_kwayprojections_pqrcpbegin ....... 
Passed 181.13 sec Start 1836: c_mpi_rep_example_simple_trans_lap_c_facto1 1325/3626 Test #1449: shm_example_simple_lap_d_facto2_sched4_kway_tqrcpbegin .................. Passed 174.42 sec Start 1837: c_mpi_rep_example_simple_trans_lap_c_facto2 1326/3626 Test #1377: shm_example_simple_lap_d_facto0_sched4_not_rqrcpbegin ................... Passed 187.43 sec Start 1838: c_mpi_rep_example_simple_trans_lap_c_facto3 1327/3626 Test #1370: shm_example_simple_lap_d_facto0_sched4_kwayprojections_svdend ........... Passed 188.07 sec Start 1839: c_mpi_rep_example_simple_trans_lap_c_facto4 Test #824: shm_example_simple_lap_s_facto2_sched1_kway_svdbegin .................... Passed 177.16 sec Start 1840: c_mpi_rep_example_simple_trans_lap_z_facto0 1329/3626 Test #1505: shm_example_simple_lap_c_facto1_sched4_not_rqrcpbegin ................... Passed 173.73 sec Start 1841: c_mpi_rep_example_simple_trans_lap_z_facto1 1330/3626 Test #1488: shm_example_simple_lap_c_facto0_sched4_kway_rqrrtend .................... Passed 174.47 sec Start 1842: c_mpi_rep_example_simple_trans_lap_z_facto2 1331/3626 Test #1334: shm_example_simple_lap_s_facto2_sched4_not_svdend ....................... Passed 195.10 sec Start 1843: c_mpi_rep_example_simple_trans_lap_z_facto3 1332/3626 Test #1443: shm_example_simple_lap_d_facto2_sched4_kway_rqrcpbegin .................. 
Passed 176.74 sec Start 1844: c_mpi_rep_example_simple_trans_lap_z_facto4 1333/3626 Test #1307: shm_example_simple_lap_s_facto1_sched4_not_pqrcpbegin ...................***Timeout 199.50 sec Start 1307: shm_example_simple_lap_s_facto1_sched4_not_pqrcpbegin 1333/3626 Test #1309: shm_example_simple_lap_s_facto1_sched4_kway_pqrcpbegin ..................***Timeout 199.57 sec Start 1309: shm_example_simple_lap_s_facto1_sched4_kway_pqrcpbegin 1333/3626 Test #1310: shm_example_simple_lap_s_facto1_sched4_kway_pqrcpend ....................***Timeout 199.58 sec Start 1310: shm_example_simple_lap_s_facto1_sched4_kway_pqrcpend 1333/3626 Test #1311: shm_example_simple_lap_s_facto1_sched4_kwayprojections_pqrcpbegin .......***Timeout 199.51 sec Start 1311: shm_example_simple_lap_s_facto1_sched4_kwayprojections_pqrcpbegin 1333/3626 Test #1313: shm_example_simple_lap_s_facto1_sched4_not_rqrcpbegin ...................***Timeout 199.33 sec Start 1313: shm_example_simple_lap_s_facto1_sched4_not_rqrcpbegin 1333/3626 Test #1316: shm_example_simple_lap_s_facto1_sched4_kway_rqrcpend ....................***Timeout 199.24 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY 
Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 1316: shm_example_simple_lap_s_facto1_sched4_kway_rqrcpend 1333/3626 Test #1317: shm_example_simple_lap_s_facto1_sched4_kwayprojections_rqrcpbegin .......***Timeout 199.36 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 1317: shm_example_simple_lap_s_facto1_sched4_kwayprojections_rqrcpbegin 1333/3626 Test #1323: shm_example_simple_lap_s_facto1_sched4_kwayprojections_tqrcpbegin .......***Timeout 198.94 sec Start 1323: shm_example_simple_lap_s_facto1_sched4_kwayprojections_tqrcpbegin 1333/3626 Test #1325: shm_example_simple_lap_s_facto1_sched4_not_rqrrtbegin ...................***Timeout 198.85 sec Start 1325: shm_example_simple_lap_s_facto1_sched4_not_rqrrtbegin 1333/3626 Test #1326: shm_example_simple_lap_s_facto1_sched4_not_rqrrtend 
.....................***Timeout 198.91 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 1326: shm_example_simple_lap_s_facto1_sched4_not_rqrrtend 1333/3626 Test #1327: shm_example_simple_lap_s_facto1_sched4_kway_rqrrtbegin ..................***Timeout 198.95 sec Start 1327: shm_example_simple_lap_s_facto1_sched4_kway_rqrrtbegin 1333/3626 Test #1328: shm_example_simple_lap_s_facto1_sched4_kway_rqrrtend ....................***Timeout 198.94 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 
160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 1328: shm_example_simple_lap_s_facto1_sched4_kway_rqrrtend 1333/3626 Test #1329: shm_example_simple_lap_s_facto1_sched4_kwayprojections_rqrrtbegin .......***Timeout 198.91 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 1329: shm_example_simple_lap_s_facto1_sched4_kwayprojections_rqrrtbegin 1333/3626 Test #1330: shm_example_simple_lap_s_facto1_sched4_kwayprojections_rqrrtend .........***Timeout 198.88 sec 
Start 1330: shm_example_simple_lap_s_facto1_sched4_kwayprojections_rqrrtend 1333/3626 Test #1331: shm_example_simple_lap_s_facto1_sched4_kway_pqrcpilu0 ...................***Timeout 198.83 sec Start 1331: shm_example_simple_lap_s_facto1_sched4_kway_pqrcpilu0 1333/3626 Test #1320: shm_example_simple_lap_s_facto1_sched4_not_tqrcpend .....................***Timeout 199.43 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 1320: shm_example_simple_lap_s_facto1_sched4_not_tqrcpend 1333/3626 Test #1332: shm_example_simple_lap_s_facto1_sched4_kway_pqrcpilu1 ...................***Timeout 198.78 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number 
of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 1332: shm_example_simple_lap_s_facto1_sched4_kway_pqrcpilu1 1333/3626 Test #1409: shm_example_simple_lap_d_facto1_sched4_not_rqrcpbegin ................... Passed 186.75 sec Start 1845: c_mpi_rep_example_step-by-step_lap_s_facto0 1334/3626 Test #1481: shm_example_simple_lap_c_facto0_sched4_kway_tqrcpbegin .................. Passed 179.40 sec Start 1846: c_mpi_rep_example_step-by-step_lap_s_facto1 Test #827: shm_example_simple_lap_s_facto2_sched1_kwayprojections_svdend ........... Passed 182.72 sec Start 1847: c_mpi_rep_example_step-by-step_lap_s_facto2 1336/3626 Test #1376: shm_example_simple_lap_d_facto0_sched4_kwayprojections_pqrcpend ......... Passed 193.93 sec Start 1848: c_mpi_rep_example_step-by-step_lap_d_facto0 1337/3626 Test #1396: shm_example_simple_lap_d_facto0_sched4_kway_pqrcpilu1 ................... 
Passed 191.64 sec Start 1849: c_mpi_rep_example_step-by-step_lap_d_facto1 1338/3626 Test #1335: shm_example_simple_lap_s_facto2_sched4_kway_svdbegin ....................***Timeout 200.29 sec Start 1335: shm_example_simple_lap_s_facto2_sched4_kway_svdbegin 1338/3626 Test #1339: shm_example_simple_lap_s_facto2_sched4_not_pqrcpbegin ...................***Timeout 199.80 sec Start 1339: shm_example_simple_lap_s_facto2_sched4_not_pqrcpbegin 1338/3626 Test #1341: shm_example_simple_lap_s_facto2_sched4_kway_pqrcpbegin ..................***Timeout 200.01 sec Start 1341: shm_example_simple_lap_s_facto2_sched4_kway_pqrcpbegin 1338/3626 Test #1349: shm_example_simple_lap_s_facto2_sched4_kwayprojections_rqrcpbegin .......***Timeout 199.78 sec Start 1349: shm_example_simple_lap_s_facto2_sched4_kwayprojections_rqrcpbegin 1338/3626 Test #1415: shm_example_simple_lap_d_facto1_sched4_not_tqrcpbegin ................... Passed 185.60 sec 1339/3626 Test #1498: shm_example_simple_lap_c_facto1_sched4_kwayprojections_svdend ........... 
Passed 183.85 sec 1340/3626 Test #1336: shm_example_simple_lap_s_facto2_sched4_kway_svdend ......................***Timeout 201.21 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 1336: shm_example_simple_lap_s_facto2_sched4_kway_svdend 1340/3626 Test #1340: shm_example_simple_lap_s_facto2_sched4_not_pqrcpend .....................***Timeout 200.84 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 
8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.421206e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.028915e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.684121e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 2.589508e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.549437e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 9.058470e-02 s Time to initialize coeftab 9.426653e-02 s Time to factorize 1.364229e+00 s ( 7.32 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 124 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 514 Ko / 514 Ko Time to solve 9.623217e-01 s Time for 
refinement 4.466530e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.925577e-07 max(|| b_i - A x_i ||_1) 8.297687e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.042658e+00 (SUCCESS) Start 1340: shm_example_simple_lap_s_facto2_sched4_not_pqrcpend 1340/3626 Test #1346: shm_example_simple_lap_s_facto2_sched4_not_rqrcpend .....................***Timeout 200.70 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 1346: shm_example_simple_lap_s_facto2_sched4_not_rqrcpend 1340/3626 Test #1358: shm_example_simple_lap_s_facto2_sched4_not_rqrrtend .....................***Timeout 199.57 sec Start 1358: shm_example_simple_lap_s_facto2_sched4_not_rqrrtend Start 1850: c_mpi_rep_example_step-by-step_lap_d_facto2 Start 1851: c_mpi_rep_example_step-by-step_lap_c_facto0 1340/3626 Test #1384: shm_example_simple_lap_d_facto0_sched4_not_tqrcpend ..................... 
Passed 197.60 sec 1341/3626 Test #1496: shm_example_simple_lap_c_facto1_sched4_kway_svdend ...................... Passed 184.87 sec 1342/3626 Test #1504: shm_example_simple_lap_c_facto1_sched4_kwayprojections_pqrcpend ......... Passed 184.54 sec 1343/3626 Test #1405: shm_example_simple_lap_d_facto1_sched4_kway_pqrcpbegin .................. Passed 193.61 sec 1344/3626 Test #1428: shm_example_simple_lap_d_facto1_sched4_kway_pqrcpilu1 ................... Passed 186.43 sec 1345/3626 Test #1470: shm_example_simple_lap_c_facto0_sched4_kway_pqrcpend .................... Passed 185.47 sec 1346/3626 Test #1365: shm_example_simple_lap_d_facto0_sched4_not_svdbegin .....................***Timeout 205.18 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.696050e+00 s +-------------------------------------------------+ Symbolic factorization subtask: 
Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.037918e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.003054e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 1.719033e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.988003e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 8.863070e-03 s Time to initialize coeftab 8.074045e-02 s Time to factorize 2.618089e+00 s ( 1.93 MFlop/s) Number of operations 6.47 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 6.319405e-01 s - iteration 1 : total iteration time 0.613 s error 2.3325e-14 Time for refinement 1.356832e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.332487e-14 max(|| b_i - A x_i ||_1) 3.473623e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.364906e-02 (SUCCESS) Start 1365: shm_example_simple_lap_d_facto0_sched4_not_svdbegin Start 1852: c_mpi_rep_example_step-by-step_lap_c_facto1 Start 1853: c_mpi_rep_example_step-by-step_lap_c_facto2 Start 1854: c_mpi_rep_example_step-by-step_lap_c_facto3 Start 1855: c_mpi_rep_example_step-by-step_lap_c_facto4 Start 1856: 
c_mpi_rep_example_step-by-step_lap_z_facto0 Start 1857: c_mpi_rep_example_step-by-step_lap_z_facto1 1346/3626 Test #1440: shm_example_simple_lap_d_facto2_sched4_kwayprojections_pqrcpend ......... Passed 205.18 sec 1347/3626 Test #1360: shm_example_simple_lap_s_facto2_sched4_kway_rqrrtend ....................***Timeout 219.10 sec Start 1360: shm_example_simple_lap_s_facto2_sched4_kway_rqrrtend 1347/3626 Test #1361: shm_example_simple_lap_s_facto2_sched4_kwayprojections_rqrrtbegin .......***Timeout 219.10 sec Start 1361: shm_example_simple_lap_s_facto2_sched4_kwayprojections_rqrrtbegin 1347/3626 Test #1362: shm_example_simple_lap_s_facto2_sched4_kwayprojections_rqrrtend .........***Timeout 219.10 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 1362: shm_example_simple_lap_s_facto2_sched4_kwayprojections_rqrrtend 1347/3626 Test #1363: shm_example_simple_lap_s_facto2_sched4_kway_pqrcpilu0 ...................***Timeout 218.90 sec ischedInit: The 
thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.459001e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.788493e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.352658e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 9.508793e-02 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.655979e+00 s 
+-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 5.778758e-03 s Time to initialize coeftab 6.612725e-01 s Time to factorize 1.401866e+00 s ( 7.12 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 124 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 514 Ko / 514 Ko Time to solve 7.444374e-01 s Time for refinement 3.519421e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.817545e-07 max(|| b_i - A x_i ||_1) 9.426255e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.184470e+00 (SUCCESS) Start 1363: shm_example_simple_lap_s_facto2_sched4_kway_pqrcpilu0 1347/3626 Test #1366: shm_example_simple_lap_d_facto0_sched4_not_svdend .......................***Timeout 218.83 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of 
projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Start 1366: shm_example_simple_lap_d_facto0_sched4_not_svdend 1347/3626 Test #1367: shm_example_simple_lap_d_facto0_sched4_kway_svdbegin ....................***Timeout 219.01 sec Start 1367: shm_example_simple_lap_d_facto0_sched4_kway_svdbegin 1347/3626 Test #1372: shm_example_simple_lap_d_facto0_sched4_not_pqrcpend .....................***Timeout 218.62 sec Start 1372: shm_example_simple_lap_d_facto0_sched4_not_pqrcpend 1347/3626 Test #1373: shm_example_simple_lap_d_facto0_sched4_kway_pqrcpbegin ..................***Timeout 218.71 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.089242e+00 s +-------------------------------------------------+ Symbolic factorization 
subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.900473e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.183213e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 1.293555e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.752572e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.394890e-02 s Time to initialize coeftab 1.325574e-01 s Time to factorize 1.838334e+00 s ( 2.75 MFlop/s) Number of operations 6.47 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 7.139884e-01 s - iteration 1 : total iteration time 0.773 s error 1.1123e-14 Time for refinement 1.438321e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.112768e-14 max(|| b_i - A x_i ||_1) 2.181498e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.741240e-02 (SUCCESS) Start 1373: shm_example_simple_lap_d_facto0_sched4_kway_pqrcpbegin 1347/3626 Test #1374: shm_example_simple_lap_d_facto0_sched4_kway_pqrcpend ....................***Timeout 218.89 sec Start 1374: shm_example_simple_lap_d_facto0_sched4_kway_pqrcpend 1347/3626 Test #1378: 
shm_example_simple_lap_d_facto0_sched4_not_rqrcpend .....................***Timeout 218.59 sec Start 1378: shm_example_simple_lap_d_facto0_sched4_not_rqrcpend 1347/3626 Test #1380: shm_example_simple_lap_d_facto0_sched4_kway_rqrcpend ....................***Timeout 218.66 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.194651e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.573256e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.581950e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of 
operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 1.307510e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.530207e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.206308e-02 s Time to initialize coeftab 3.788622e-02 s Time to factorize 1.388426e+00 s ( 3.65 MFlop/s) Number of operations 6.47 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 8.300798e-01 s Time for refinement 5.612551e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.656742e-16 max(|| b_i - A x_i ||_1) 1.929618e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.424730e-03 (SUCCESS) Start 1380: shm_example_simple_lap_d_facto0_sched4_kway_rqrcpend 1347/3626 Test #1381: shm_example_simple_lap_d_facto0_sched4_kwayprojections_rqrcpbegin .......***Timeout 218.56 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: 
Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Start 1381: shm_example_simple_lap_d_facto0_sched4_kwayprojections_rqrcpbegin 1347/3626 Test #1385: shm_example_simple_lap_d_facto0_sched4_kway_tqrcpbegin ..................***Timeout 217.79 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Start 1385: shm_example_simple_lap_d_facto0_sched4_kway_tqrcpbegin 1347/3626 Test #1391: shm_example_simple_lap_d_facto0_sched4_kway_rqrrtbegin ..................***Timeout 216.99 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + 
PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Start 1391: shm_example_simple_lap_d_facto0_sched4_kway_rqrrtbegin 1347/3626 Test #1392: shm_example_simple_lap_d_facto0_sched4_kway_rqrrtend ....................***Timeout 216.84 sec Start 1392: shm_example_simple_lap_d_facto0_sched4_kway_rqrrtend 1347/3626 Test #1394: shm_example_simple_lap_d_facto0_sched4_kwayprojections_rqrrtend .........***Timeout 216.68 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 
Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.257999e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.874433e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.673374e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 3.744879e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.817093e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 8.577800e-02 s Time to initialize coeftab 4.499086e-02 s Time to factorize 1.278239e+00 s ( 3.96 MFlop/s) Number of operations 6.47 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 6.508847e-01 s Time for refinement 3.882264e-01 s || A ||_1 5.112481e-02 
max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.595314e-16 max(|| b_i - A x_i ||_1) 1.915262e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.406691e-03 (SUCCESS) Start 1394: shm_example_simple_lap_d_facto0_sched4_kwayprojections_rqrrtend 1347/3626 Test #1395: shm_example_simple_lap_d_facto0_sched4_kway_pqrcpilu0 ...................***Timeout 216.65 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Start 1395: shm_example_simple_lap_d_facto0_sched4_kway_pqrcpilu0 1347/3626 Test #1397: shm_example_simple_lap_d_facto1_sched4_not_svdbegin .....................***Timeout 216.44 sec Start 1397: shm_example_simple_lap_d_facto1_sched4_not_svdbegin 1347/3626 Test #1398: shm_example_simple_lap_d_facto1_sched4_not_svdend .......................***Timeout 216.46 sec Start 1398: shm_example_simple_lap_d_facto1_sched4_not_svdend 1347/3626 Test #1400: shm_example_simple_lap_d_facto1_sched4_kway_svdend 
......................***Timeout 216.05 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Start 1400: shm_example_simple_lap_d_facto1_sched4_kway_svdend 1347/3626 Test #1401: shm_example_simple_lap_d_facto1_sched4_kwayprojections_svdbegin .........***Timeout 215.85 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD 
Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Start 1401: shm_example_simple_lap_d_facto1_sched4_kwayprojections_svdbegin Test #815: shm_example_simple_lap_s_facto1_sched1_not_rqrrtend ..................... Passed 209.70 sec 1348/3626 Test #1426: shm_example_simple_lap_d_facto1_sched4_kwayprojections_rqrrtend ......... Passed 208.03 sec 1349/3626 Test #1435: shm_example_simple_lap_d_facto2_sched4_not_pqrcpbegin ................... Passed 207.92 sec 1350/3626 Test #1446: shm_example_simple_lap_d_facto2_sched4_kwayprojections_rqrcpend ......... Passed 207.74 sec 1351/3626 Test #1478: shm_example_simple_lap_c_facto0_sched4_kwayprojections_rqrcpend ......... Passed 206.94 sec 1352/3626 Test #1493: shm_example_simple_lap_c_facto1_sched4_not_svdbegin ..................... Passed 206.65 sec 1353/3626 Test #1518: shm_example_simple_lap_c_facto1_sched4_not_rqrrtend ..................... 
Passed 203.64 sec Test #1043: shm_example_simple_lap_c_facto2_sched1_kway_pqrcpilu0 ...................***Timeout 208.41 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.183100e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.192855e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.297129e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 
9.121200e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.343203e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 5.655720e-02 s Time to initialize coeftab 5.410147e-02 s Time to factorize 4.840916e+00 s ( 8.26 MFlop/s) Number of operations 8.15 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 1.270102e+00 s Time for refinement 5.760091e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.043694e-07 max(|| b_i - A x_i ||_1) 1.111189e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.803861e+00 (SUCCESS) Start 1043: shm_example_simple_lap_c_facto2_sched1_kway_pqrcpilu0 1354/3626 Test #1413: shm_example_simple_lap_d_facto1_sched4_kwayprojections_rqrcpbegin .......***Timeout 208.47 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 
1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.516183e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.501145e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.289009e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 8.129660e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.916851e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.557816e-01 s Time to initialize coeftab 8.566328e-01 s Time to factorize 6.050134e+00 s (885.78 KFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 7.731035e-01 s - iteration 1 : total iteration time 0.521 s error 3.083e-14 Time for refinement 1.242779e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 
max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.083600e-14 max(|| b_i - A x_i ||_1) 5.007562e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.292432e-02 (SUCCESS) Start 1413: shm_example_simple_lap_d_facto1_sched4_kwayprojections_rqrcpbegin 1354/3626 Test #1414: shm_example_simple_lap_d_facto1_sched4_kwayprojections_rqrcpend .........***Timeout 208.51 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.766695e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.030954e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.177674e-01 s 
+-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 9.322918e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.263844e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.599516e-01 s Time to initialize coeftab 7.251253e-02 s Time to factorize 1.908147e+00 s ( 2.74 MFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 7.512129e-01 s Time for refinement 4.539821e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.471455e-16 max(|| b_i - A x_i ||_1) 1.839253e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.311179e-03 (SUCCESS) Start 1414: shm_example_simple_lap_d_facto1_sched4_kwayprojections_rqrcpend 1354/3626 Test #1416: shm_example_simple_lap_d_facto1_sched4_not_tqrcpend .....................***Timeout 208.58 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution 
level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.182127e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.298896e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.275780e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 4.639795e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.102885e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 6.368422e-03 s Time to initialize coeftab 5.905757e-02 s Time to factorize 1.966647e+00 s ( 2.66 MFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 
0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 7.505530e-01 s Time for refinement 4.181314e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.584177e-16 max(|| b_i - A x_i ||_1) 1.851874e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.327039e-03 (SUCCESS) Start 1416: shm_example_simple_lap_d_facto1_sched4_not_tqrcpend 1354/3626 Test #1417: shm_example_simple_lap_d_facto1_sched4_kway_tqrcpbegin ..................***Timeout 208.60 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.399235e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to 
compute symbol matrix 1.272779e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.591539e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.089084e-02 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.527195e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 6.444805e-03 s Time to initialize coeftab 8.166537e-01 s Time to factorize 5.615532e+00 s (954.33 KFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 6.369743e-01 s - iteration 1 : total iteration time 0.647 s error 7.8974e-14 Time for refinement 1.232982e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.897202e-14 max(|| b_i - A x_i ||_1) 1.437228e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.806000e-01 (SUCCESS) Start 1417: shm_example_simple_lap_d_facto1_sched4_kway_tqrcpbegin 1354/3626 Test #1418: shm_example_simple_lap_d_facto1_sched4_kway_tqrcpend ....................***Timeout 208.63 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: 
Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.508077e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.725889e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.393591e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.467085e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.672443e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.080167e-01 s Time to initialize coeftab 4.449475e-02 s Time to factorize 1.752002e+00 s ( 2.99 MFlop/s) Number of operations 6.64 MFlops 
Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 9.723620e-01 s Time for refinement 4.786833e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.501029e-16 max(|| b_i - A x_i ||_1) 1.847779e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.321892e-03 (SUCCESS) Start 1418: shm_example_simple_lap_d_facto1_sched4_kway_tqrcpend 1354/3626 Test #1419: shm_example_simple_lap_d_facto1_sched4_kwayprojections_tqrcpbegin .......***Timeout 208.65 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method 
is: Scotch Time to compute ordering 6.350491e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.805178e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.800197e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 4.061583e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.276479e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 7.713985e-02 s Time to initialize coeftab 3.434357e-01 s Time to factorize 3.674784e+00 s ( 1.42 MFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 7.478084e-01 s - iteration 1 : total iteration time 1.11 s error 2.3603e-14 Time for refinement 2.071468e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.360082e-14 max(|| b_i - A x_i ||_1) 4.789002e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.017792e-02 (SUCCESS) Start 1419: shm_example_simple_lap_d_facto1_sched4_kwayprojections_tqrcpbegin 1354/3626 Test #1420: shm_example_simple_lap_d_facto1_sched4_kwayprojections_tqrcpend .........***Timeout 208.73 
sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.811939e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.857076e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.746057e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.053604e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.806677e+00 s 
+-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.466672e-01 s Time to initialize coeftab 7.711266e-02 s Time to factorize 1.484954e+00 s ( 3.52 MFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 5.711396e-01 s Time for refinement 4.018867e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.553888e-16 max(|| b_i - A x_i ||_1) 1.858247e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.335047e-03 (SUCCESS) Start 1420: shm_example_simple_lap_d_facto1_sched4_kwayprojections_tqrcpend 1354/3626 Test #1421: shm_example_simple_lap_d_facto1_sched4_not_rqrrtbegin ...................***Timeout 208.78 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used 
Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.660983e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.283609e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.124738e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.558304e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.816837e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.068625e-01 s Time to initialize coeftab 4.677776e-01 s Time to factorize 4.365625e+00 s ( 1.20 MFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 8.790481e-01 s - iteration 1 : total iteration time 0.573 s error 5.77e-13 Time for refinement 1.374675e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.770000e-13 max(|| b_i - A x_i ||_1) 1.125465e-13 
max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.414243e+00 (SUCCESS) Start 1421: shm_example_simple_lap_d_facto1_sched4_not_rqrrtbegin 1354/3626 Test #1423: shm_example_simple_lap_d_facto1_sched4_kway_rqrrtbegin ..................***Timeout 208.83 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.519698e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.091602e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.140255e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations 
in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.042171e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.497114e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.676875e-01 s Time to initialize coeftab 8.545133e-01 s Time to factorize 4.241617e+00 s ( 1.23 MFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 8.233004e-01 s - iteration 1 : total iteration time 0.503 s error 4.9763e-13 Time for refinement 1.075239e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.976282e-13 max(|| b_i - A x_i ||_1) 7.514287e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.442347e-01 (SUCCESS) Start 1423: shm_example_simple_lap_d_facto1_sched4_kway_rqrrtbegin 1354/3626 Test #1424: shm_example_simple_lap_d_facto1_sched4_kway_rqrrtend ....................***Timeout 208.89 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL 
GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.455736e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.579946e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.222354e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.211196e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.666740e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.199942e-02 s Time to initialize coeftab 2.821485e-02 s Time to factorize 1.477332e+00 s ( 3.54 MFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to 
solve 8.446607e-01 s Time for refinement 4.218158e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.552113e-16 max(|| b_i - A x_i ||_1) 1.852180e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.327424e-03 (SUCCESS) Start 1424: shm_example_simple_lap_d_facto1_sched4_kway_rqrrtend 1354/3626 Test #1425: shm_example_simple_lap_d_facto1_sched4_kwayprojections_rqrrtbegin .......***Timeout 208.91 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.501564e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.084428e-02 s +-------------------------------------------------+ 
Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.598830e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.716217e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.861502e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 8.002981e-02 s Time to initialize coeftab 5.266323e-01 s Time to factorize 3.557657e+00 s ( 1.47 MFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 8.839664e-01 s - iteration 1 : total iteration time 0.472 s error 3.7855e-13 Time for refinement 1.199125e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.785462e-13 max(|| b_i - A x_i ||_1) 7.195523e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.041792e-01 (SUCCESS) Start 1425: shm_example_simple_lap_d_facto1_sched4_kwayprojections_rqrrtbegin 1354/3626 Test #1427: shm_example_simple_lap_d_facto1_sched4_kway_pqrcpilu0 ...................***Timeout 208.92 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: 
Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.841054e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.290022e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.613606e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.707896e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.233145e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 6.228706e-03 s Time to initialize coeftab 3.819931e-02 s Time to factorize 1.575834e+00 s ( 3.32 MFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Compression: 
------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 8.101509e-01 s - iteration 1 : total iteration time 0.636 s error 9.2179e-15 Time for refinement 1.358024e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.221536e-15 max(|| b_i - A x_i ||_1) 7.462849e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.377711e-03 (SUCCESS) Start 1427: shm_example_simple_lap_d_facto1_sched4_kway_pqrcpilu0 1354/3626 Test #1429: shm_example_simple_lap_d_facto2_sched4_not_svdbegin .....................***Timeout 208.93 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : 
Ordering method is: Scotch Time to compute ordering 3.697209e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.016940e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.819152e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 1.371423e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.948800e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.256859e-02 s Time to initialize coeftab 1.074337e+00 s Time to factorize 1.011505e+01 s (1010.79 KFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 5.256673e-01 s - iteration 1 : total iteration time 0.364 s error 1.7001e-14 Time for refinement 9.402465e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.700187e-14 max(|| b_i - A x_i ||_1) 3.503999e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.403076e-02 (SUCCESS) Start 1429: shm_example_simple_lap_d_facto2_sched4_not_svdbegin 1354/3626 Test #1430: shm_example_simple_lap_d_facto2_sched4_not_svdend .......................***Timeout 209.03 sec 
ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.087742e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.387177e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.581504e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 1.283893e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.341692e+00 s 
+-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 8.837449e-03 s Time to initialize coeftab 7.118592e-02 s Time to factorize 3.124680e+00 s ( 3.20 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 9.128988e-01 s Time for refinement 3.265245e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.651974e-16 max(|| b_i - A x_i ||_1) 1.794992e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.255561e-03 (SUCCESS) Start 1430: shm_example_simple_lap_d_facto2_sched4_not_svdend 1354/3626 Test #1431: shm_example_simple_lap_d_facto2_sched4_kway_svdbegin ....................***Timeout 209.09 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels 
of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.640671e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.497955e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.101121e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 1.361780e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.927663e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 8.629215e-02 s Time to initialize coeftab 6.255187e-01 s Time to factorize 1.048972e+01 s (974.69 KFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 5.406427e-01 s - iteration 1 : total iteration time 0.236 s error 2.0926e-14 Time for refinement 7.859251e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.092777e-14 max(|| b_i - A x_i ||_1) 3.918972e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * 
||x_i||_oo * eps)) 4.924525e-02 (SUCCESS) Start 1431: shm_example_simple_lap_d_facto2_sched4_kway_svdbegin 1354/3626 Test #1432: shm_example_simple_lap_d_facto2_sched4_kway_svdend ......................***Timeout 209.17 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.392333e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.960439e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.508526e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: 
Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 3.274530e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.919366e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.544772e-01 s Time to initialize coeftab 1.241097e-01 s Time to factorize 2.633031e+00 s ( 3.79 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 8.810989e-01 s Time for refinement 5.952953e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.620456e-16 max(|| b_i - A x_i ||_1) 1.800720e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.262759e-03 (SUCCESS) Start 1432: shm_example_simple_lap_d_facto2_sched4_kway_svdend 1354/3626 Test #1433: shm_example_simple_lap_d_facto2_sched4_kwayprojections_svdbegin .........***Timeout 209.18 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD 
Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.659719e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.639804e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.682794e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 6.045133e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.774960e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.024120e-01 s Time to initialize coeftab 6.671195e-01 s Time to factorize 1.254927e+01 s (814.73 KFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 7.888930e-01 s - iteration 1 : total iteration time 0.647 s error 2.3854e-14 Time for refinement 
1.238167e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.385439e-14 max(|| b_i - A x_i ||_1) 3.869894e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.862854e-02 (SUCCESS) Start 1433: shm_example_simple_lap_d_facto2_sched4_kwayprojections_svdbegin 1354/3626 Test #1434: shm_example_simple_lap_d_facto2_sched4_kwayprojections_svdend ...........***Timeout 209.18 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.902024e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.981518e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 
Stoping criterion -1 Time for reordering 3.531223e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 2.594084e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.283326e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 7.212515e-03 s Time to initialize coeftab 2.674014e-02 s Time to factorize 3.570145e+00 s ( 2.80 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 6.988287e-01 s Time for refinement 6.034669e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.665854e-16 max(|| b_i - A x_i ||_1) 1.817440e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.283770e-03 (SUCCESS) Start 1434: shm_example_simple_lap_d_facto2_sched4_kwayprojections_svdend 1354/3626 Test #1436: shm_example_simple_lap_d_facto2_sched4_not_pqrcpend .....................***Timeout 209.28 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication 
support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.703033e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.889125e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.472343e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 5.546537e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.603141e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.524124e-01 s Time to initialize coeftab 1.466389e-01 s Time to factorize 1.538691e+00 s ( 6.49 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not 
selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 1.312176e+00 s Time for refinement 8.455015e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.696187e-16 max(|| b_i - A x_i ||_1) 1.826700e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.295406e-03 (SUCCESS) Start 1436: shm_example_simple_lap_d_facto2_sched4_not_pqrcpend 1354/3626 Test #1437: shm_example_simple_lap_d_facto2_sched4_kway_pqrcpbegin ..................***Timeout 209.29 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.061305e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 
65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.694776e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.833368e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 3.738020e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.810642e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 7.050117e-02 s Time to initialize coeftab 1.563104e-01 s Time to factorize 2.547120e+00 s ( 3.92 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 4.427910e-01 s - iteration 1 : total iteration time 0.427 s error 8.8544e-15 Time for refinement 7.957431e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.843448e-15 max(|| b_i - A x_i ||_1) 1.516303e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.905365e-02 (SUCCESS) Start 1437: shm_example_simple_lap_d_facto2_sched4_kway_pqrcpbegin 1354/3626 Test #1438: shm_example_simple_lap_d_facto2_sched4_kway_pqrcpend ....................***Timeout 209.30 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 
Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.276913e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.215920e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.250650e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 6.739121e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.551871e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 9.153581e-02 s Time to initialize coeftab 3.221737e-02 s Time to factorize 1.540195e+00 s ( 6.48 MFlop/s) Number of 
operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 9.000317e-01 s Time for refinement 5.541854e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.674748e-16 max(|| b_i - A x_i ||_1) 1.824902e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.293146e-03 (SUCCESS) Start 1438: shm_example_simple_lap_d_facto2_sched4_kway_pqrcpend 1354/3626 Test #1439: shm_example_simple_lap_d_facto2_sched4_kwayprojections_pqrcpbegin .......***Timeout 209.30 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering 
subtask : Ordering method is: Scotch Time to compute ordering 4.103947e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.465144e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.525258e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 4.556057e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.997193e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 8.792877e-03 s Time to initialize coeftab 1.720139e-01 s Time to factorize 4.202649e+00 s ( 2.38 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 8.058741e-01 s - iteration 1 : total iteration time 0.781 s error 9.5258e-15 Time for refinement 1.823299e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.534192e-15 max(|| b_i - A x_i ||_1) 1.533892e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.927468e-02 (SUCCESS) Start 1439: shm_example_simple_lap_d_facto2_sched4_kwayprojections_pqrcpbegin 1354/3626 Test #1441: shm_example_simple_lap_d_facto2_sched4_not_rqrcpbegin 
...................***Timeout 209.29 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.859836e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.542397e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.481191e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 1.050444e-01 s +-------------------------------------------------+ Analyze task: Total time for 
analyze 7.130255e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.194565e-03 s Time to initialize coeftab 3.045662e-01 s Time to factorize 5.286528e+00 s ( 1.89 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 763 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 1.003143e+00 s - iteration 1 : total iteration time 0.568 s error 3.2494e-14 Time for refinement 1.540590e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.248483e-14 max(|| b_i - A x_i ||_1) 6.246381e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.849115e-02 (SUCCESS) Start 1441: shm_example_simple_lap_d_facto2_sched4_not_rqrcpbegin 1354/3626 Test #1442: shm_example_simple_lap_d_facto2_sched4_not_rqrcpend .....................***Timeout 209.30 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block 
Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.613691e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.844081e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.990737e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 5.572866e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.541440e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.036642e-01 s Time to initialize coeftab 6.594366e-02 s Time to factorize 2.479363e+00 s ( 4.03 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 7.821671e-01 s Time for refinement 3.358146e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.640137e-16 max(|| b_i - A x_i ||_1) 1.786170e-16 
max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.244476e-03 (SUCCESS) Start 1442: shm_example_simple_lap_d_facto2_sched4_not_rqrcpend 1354/3626 Test #1444: shm_example_simple_lap_d_facto2_sched4_kway_rqrcpend ....................***Timeout 209.28 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.802482e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.052433e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.260687e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in 
full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 6.158843e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.962816e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.397470e-01 s Time to initialize coeftab 1.103784e-01 s Time to factorize 2.681349e+00 s ( 3.72 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 7.425534e-01 s Time for refinement 3.478005e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.616732e-16 max(|| b_i - A x_i ||_1) 1.786569e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.244977e-03 (SUCCESS) Start 1444: shm_example_simple_lap_d_facto2_sched4_kway_rqrcpend 1354/3626 Test #1447: shm_example_simple_lap_d_facto2_sched4_not_tqrcpbegin ...................***Timeout 209.26 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal 
Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.785560e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.394650e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.163447e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 1.196320e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.323657e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.084901e-03 s Time to initialize coeftab 4.584228e-01 s Time to factorize 4.246193e+00 s ( 2.35 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 763 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 9.640869e-01 s - iteration 1 : total iteration time 1.06 s error 2.542e-14 Time 
for refinement 1.725671e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.542138e-14 max(|| b_i - A x_i ||_1) 3.641064e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.575309e-02 (SUCCESS) Start 1447: shm_example_simple_lap_d_facto2_sched4_not_tqrcpbegin 1354/3626 Test #1448: shm_example_simple_lap_d_facto2_sched4_not_tqrcpend .....................***Timeout 209.29 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.291597e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.155333e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping 
criterion -1 Time for reordering 6.051473e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 3.889023e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.955732e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 9.212173e-02 s Time to initialize coeftab 7.390427e-02 s Time to factorize 1.699649e+00 s ( 5.87 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 9.448285e-01 s Time for refinement 5.063859e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.614504e-16 max(|| b_i - A x_i ||_1) 1.800245e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.262162e-03 (SUCCESS) Start 1448: shm_example_simple_lap_d_facto2_sched4_not_tqrcpend 1354/3626 Test #1451: shm_example_simple_lap_d_facto2_sched4_kwayprojections_tqrcpbegin .......***Timeout 209.75 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: 
PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.193867e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.713526e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.931619e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 2.444953e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.773048e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.373543e-01 s Time to initialize coeftab 1.195948e+00 s Time to factorize 8.860496e+00 s ( 1.13 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside 
not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 5.438359e-01 s - iteration 1 : total iteration time 0.241 s error 3.1656e-14 Time for refinement 6.250257e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.165850e-14 max(|| b_i - A x_i ||_1) 4.851503e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.096331e-02 (SUCCESS) Start 1451: shm_example_simple_lap_d_facto2_sched4_kwayprojections_tqrcpbegin 1354/3626 Test #1453: shm_example_simple_lap_d_facto2_sched4_not_rqrrtbegin ...................***Timeout 209.79 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.030721e+00 s +-------------------------------------------------+ Symbolic factorization 
subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.125089e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.879094e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 2.690530e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.940956e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 5.902494e-02 s Time to initialize coeftab 7.066940e-01 s Time to factorize 6.523640e+00 s ( 1.53 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 8.867815e-01 s - iteration 1 : total iteration time 0.46 s error 1.4229e-13 Time for refinement 1.277043e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.422974e-13 max(|| b_i - A x_i ||_1) 2.414258e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.033722e-01 (SUCCESS) Start 1453: shm_example_simple_lap_d_facto2_sched4_not_rqrrtbegin 1354/3626 Test #1455: shm_example_simple_lap_d_facto2_sched4_kway_rqrrtbegin ..................***Timeout 209.83 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse 
matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.954436e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.273109e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.083323e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 1.415268e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.254730e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 9.925978e-02 s Time to initialize 
coeftab 8.820104e-01 s Time to factorize 4.015716e+00 s ( 2.49 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 6.598137e-01 s - iteration 1 : total iteration time 0.252 s error 6.689e-13 Time for refinement 8.032467e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.688989e-13 max(|| b_i - A x_i ||_1) 7.289115e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.159400e-01 (SUCCESS) Start 1455: shm_example_simple_lap_d_facto2_sched4_kway_rqrrtbegin 1354/3626 Test #1456: shm_example_simple_lap_d_facto2_sched4_kway_rqrrtend ....................***Timeout 209.87 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: 
Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.917286e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.207442e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.566999e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 6.911992e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.339150e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.205787e-01 s Time to initialize coeftab 3.230104e-02 s Time to factorize 2.208159e+00 s ( 4.52 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 7.217782e-01 s Time for refinement 5.219937e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.625257e-16 max(|| b_i - A x_i ||_1) 1.799827e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.261637e-03 (SUCCESS) Start 1456: shm_example_simple_lap_d_facto2_sched4_kway_rqrrtend 1354/3626 Test #1457: 
shm_example_simple_lap_d_facto2_sched4_kwayprojections_rqrrtbegin .......***Timeout 209.91 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.701208e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.650122e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.189003e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 9.786642e-02 s 
+-------------------------------------------------+ Analyze task: Total time for analyze 2.935763e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 9.160052e-03 s Time to initialize coeftab 7.877167e-01 s Time to factorize 5.678663e+00 s ( 1.76 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 8.903431e-01 s - iteration 1 : total iteration time 0.641 s error 4.8948e-13 Time for refinement 1.220151e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.894871e-13 max(|| b_i - A x_i ||_1) 7.820262e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.826831e-01 (SUCCESS) Start 1457: shm_example_simple_lap_d_facto2_sched4_kwayprojections_rqrrtbegin 1354/3626 Test #1458: shm_example_simple_lap_d_facto2_sched4_kwayprojections_rqrrtend .........***Timeout 209.92 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal 
width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.218020e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.291140e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.784192e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 5.292949e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.237746e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.067929e-01 s Time to initialize coeftab 1.383337e-02 s Time to factorize 2.012370e+00 s ( 4.96 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 5.341867e-01 s Time for refinement 1.225350e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 
4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.703526e-16 max(|| b_i - A x_i ||_1) 1.812326e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.277343e-03 (SUCCESS) Start 1458: shm_example_simple_lap_d_facto2_sched4_kwayprojections_rqrrtend 1354/3626 Test #1459: shm_example_simple_lap_d_facto2_sched4_kway_pqrcpilu0 ...................***Timeout 209.94 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.051423e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.490882e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.864983e-02 s 
+-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 2.283896e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.375048e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.826942e-03 s Time to initialize coeftab 2.855003e-01 s Time to factorize 2.133352e+00 s ( 4.68 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 8.014845e-01 s - iteration 1 : total iteration time 0.47 s error 5.5593e-15 Time for refinement 1.029621e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.563127e-15 max(|| b_i - A x_i ||_1) 6.251049e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.854980e-03 (SUCCESS) Start 1459: shm_example_simple_lap_d_facto2_sched4_kway_pqrcpilu0 1354/3626 Test #1460: shm_example_simple_lap_d_facto2_sched4_kway_pqrcpilu1 ...................***Timeout 209.95 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication 
support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.759177e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.682266e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.528920e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 6.429520e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.983677e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.378796e-01 s Time to initialize coeftab 3.472842e-01 s Time to factorize 1.779400e+00 s ( 5.61 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not 
selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 6.467139e-01 s - iteration 1 : total iteration time 0.728 s error 2.3229e-15 Time for refinement 1.493723e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.329925e-15 max(|| b_i - A x_i ||_1) 1.347377e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.693095e-03 (SUCCESS) Start 1460: shm_example_simple_lap_d_facto2_sched4_kway_pqrcpilu1 1354/3626 Test #1463: shm_example_simple_lap_c_facto0_sched4_kway_svdbegin ....................***Timeout 209.99 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.484399e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol 
factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.085820e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.213362e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 4.510980e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.060666e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.530419e-03 s Time to initialize coeftab 6.690803e-01 s Time to factorize 1.250430e+01 s ( 1.62 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 8.955652e-01 s Time for refinement 7.303360e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.072355e-07 max(|| b_i - A x_i ||_1) 9.566194e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.413836e+00 (SUCCESS) Start 1463: shm_example_simple_lap_c_facto0_sched4_kway_svdbegin 1354/3626 Test #1464: shm_example_simple_lap_c_facto0_sched4_kway_svdend ......................***Timeout 209.99 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ 
Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.928039e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.998996e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.347419e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 1.259567e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.237270e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.194071e-02 s Time to initialize coeftab 2.583021e-02 s Time to factorize 6.021566e+00 s ( 3.37 
MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 1.083395e+00 s Time for refinement 6.177847e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.030431e-07 max(|| b_i - A x_i ||_1) 9.053705e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.284520e+00 (SUCCESS) Start 1464: shm_example_simple_lap_c_facto0_sched4_kway_svdend 1354/3626 Test #1468: shm_example_simple_lap_c_facto0_sched4_not_pqrcpend .....................***Timeout 210.01 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ 
Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.476469e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.213861e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.255210e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 3.883825e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.217833e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.347973e-03 s Time to initialize coeftab 2.762975e-02 s Time to factorize 2.352858e+00 s ( 8.62 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 6.831387e-01 s Time for refinement 3.000853e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.039034e-07 max(|| b_i - A x_i ||_1) 9.087953e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.293162e+00 (SUCCESS) Start 1468: shm_example_simple_lap_c_facto0_sched4_not_pqrcpend 1354/3626 Test #1469: shm_example_simple_lap_c_facto0_sched4_kway_pqrcpbegin ..................***Timeout 210.02 sec ischedInit: The thread number has been 
automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.984908e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.099723e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.188992e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 1.452687e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.270568e+00 s +-------------------------------------------------+ 
Factorization task: Factorization used: LL^h Time to initialize internal csc 4.689290e-02 s Time to initialize coeftab 2.162535e-01 s Time to factorize 8.505241e+00 s ( 2.38 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 9.662294e-01 s - iteration 1 : total iteration time 0.767 s error 2.0093e-11 Time for refinement 1.701637e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822262e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.513346e-08 max(|| b_i - A x_i ||_1) 3.174387e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.009930e-01 (SUCCESS) Start 1469: shm_example_simple_lap_c_facto0_sched4_kway_pqrcpbegin 1354/3626 Test #1471: shm_example_simple_lap_c_facto0_sched4_kwayprojections_pqrcpbegin .......***Timeout 210.03 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and 
projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.138001e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.138047e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.929298e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 2.609792e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.994929e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.499087e-02 s Time to initialize coeftab 2.351605e-01 s Time to factorize 6.596918e+00 s ( 3.07 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 9.453226e-01 s Time for refinement 3.726511e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.276807e-07 max(|| b_i - A x_i ||_1) 1.380111e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * 
||x_i||_oo * eps)) 3.482432e+00 (SUCCESS) Start 1471: shm_example_simple_lap_c_facto0_sched4_kwayprojections_pqrcpbegin 1354/3626 Test #1472: shm_example_simple_lap_c_facto0_sched4_kwayprojections_pqrcpend .........***Timeout 210.04 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.957418e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.916178e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.767496e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in 
full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 4.501214e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.843701e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 8.366326e-02 s Time to initialize coeftab 9.232291e-02 s Time to factorize 3.493125e+00 s ( 5.81 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 1.084498e+00 s Time for refinement 4.630210e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.030988e-07 max(|| b_i - A x_i ||_1) 9.057851e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.285566e+00 (SUCCESS) Start 1472: shm_example_simple_lap_c_facto0_sched4_kwayprojections_pqrcpend 1354/3626 Test #1473: shm_example_simple_lap_c_facto0_sched4_not_rqrcpbegin ...................***Timeout 210.04 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: 
Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 1473: shm_example_simple_lap_c_facto0_sched4_not_rqrcpbegin 1354/3626 Test #1474: shm_example_simple_lap_c_facto0_sched4_not_rqrcpend .....................***Timeout 210.06 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.156446e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 
17.592432 Time to compute symbol matrix 1.180893e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.183519e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 1.430619e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.391087e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.209253e-02 s Time to initialize coeftab 3.831119e-02 s Time to factorize 6.142881e+00 s ( 3.30 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 9.295201e-01 s Time for refinement 6.328896e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.028890e-07 max(|| b_i - A x_i ||_1) 9.143400e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.307153e+00 (SUCCESS) Start 1474: shm_example_simple_lap_c_facto0_sched4_not_rqrcpend 1354/3626 Test #1475: shm_example_simple_lap_c_facto0_sched4_kway_rqrcpbegin ..................***Timeout 210.08 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: 
Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.398669e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.660204e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.742070e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 1.666918e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.860636e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.619752e-03 s Time to initialize coeftab 5.863454e-01 s Time to factorize 1.072940e+01 s ( 1.89 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Compression: 
------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 1.875171e+00 s - iteration 1 : total iteration time 0.56 s error 4.9269e-11 Time for refinement 1.210151e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822262e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.350399e-08 max(|| b_i - A x_i ||_1) 3.228154e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.145597e-01 (SUCCESS) Start 1475: shm_example_simple_lap_c_facto0_sched4_kway_rqrcpbegin 1354/3626 Test #1476: shm_example_simple_lap_c_facto0_sched4_kway_rqrcpend ....................***Timeout 210.10 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : 
Ordering method is: Scotch Time to compute ordering 3.378675e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.999217e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.470129e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 4.499686e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.275342e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.077380e-01 s Time to initialize coeftab 1.071116e-01 s Time to factorize 4.404409e+00 s ( 4.60 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 8.710672e-01 s Time for refinement 2.056258e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.035332e-07 max(|| b_i - A x_i ||_1) 9.090055e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.293692e+00 (SUCCESS) Start 1476: shm_example_simple_lap_c_facto0_sched4_kway_rqrcpend 1354/3626 Test #1477: shm_example_simple_lap_c_facto0_sched4_kwayprojections_rqrcpbegin .......***Timeout 210.12 sec ischedInit: The thread number has been automatically set 
to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.377306e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.500506e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.516674e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 5.179008e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.202704e+00 s +-------------------------------------------------+ 
Factorization task: Factorization used: LL^h Time to initialize internal csc 1.522766e-01 s Time to initialize coeftab 6.913463e-01 s Time to factorize 1.624062e+01 s ( 1.25 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 5.793890e-01 s Time for refinement 6.865424e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.770495e-07 max(|| b_i - A x_i ||_1) 1.012350e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.554462e+00 (SUCCESS) Start 1477: shm_example_simple_lap_c_facto0_sched4_kwayprojections_rqrcpbegin 1354/3626 Test #1479: shm_example_simple_lap_c_facto0_sched4_not_tqrcpbegin ...................***Timeout 210.13 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 
Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.474159e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.163538e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.166677e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 5.078723e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.378130e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 8.627042e-02 s Time to initialize coeftab 8.639803e-01 s Time to factorize 9.669808e+00 s ( 2.10 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 6.942688e-01 s Time for refinement 4.382264e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.443115e-07 max(|| b_i - A x_i ||_1) 1.243586e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.137938e+00 (SUCCESS) Start 1479: 
shm_example_simple_lap_c_facto0_sched4_not_tqrcpbegin 1354/3626 Test #1480: shm_example_simple_lap_c_facto0_sched4_not_tqrcpend .....................***Timeout 210.14 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.380806e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.892543e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.983358e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 
5.462050e-04 s Time for mapping/scheduling 4.754370e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.433410e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.003466e-01 s Time to initialize coeftab 4.868139e-02 s Time to factorize 2.817338e+00 s ( 7.20 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 4.763993e-01 s Time for refinement 4.649455e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.076872e-07 max(|| b_i - A x_i ||_1) 9.190947e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.319150e+00 (SUCCESS) Start 1480: shm_example_simple_lap_c_facto0_sched4_not_tqrcpend 1354/3626 Test #1482: shm_example_simple_lap_c_facto0_sched4_kway_tqrcpend ....................***Timeout 210.15 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 
Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.632266e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.458044e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.078014e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 1.036676e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.098655e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.425734e-03 s Time to initialize coeftab 6.987911e-02 s Time to factorize 4.858330e+00 s ( 4.17 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 7.269840e-01 s Time for refinement 6.116285e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 
max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.005163e-07 max(|| b_i - A x_i ||_1) 9.010933e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.273728e+00 (SUCCESS) Start 1482: shm_example_simple_lap_c_facto0_sched4_kway_tqrcpend 1354/3626 Test #1483: shm_example_simple_lap_c_facto0_sched4_kwayprojections_tqrcpbegin .......***Timeout 210.17 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.440681e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.671502e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.212232e-02 s +-------------------------------------------------+ 
Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 4.451283e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.250697e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.081485e-01 s Time to initialize coeftab 7.396041e-01 s Time to factorize 1.580215e+01 s ( 1.28 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 4.901431e-01 s - iteration 1 : total iteration time 0.688 s error 4.8747e-11 Time for refinement 1.206338e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822262e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.321689e-08 max(|| b_i - A x_i ||_1) 3.255247e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.213961e-01 (SUCCESS) Start 1483: shm_example_simple_lap_c_facto0_sched4_kwayprojections_tqrcpbegin 1354/3626 Test #1484: shm_example_simple_lap_c_facto0_sched4_kwayprojections_tqrcpend .........***Timeout 210.19 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple 
Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.488856e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.007438e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.532233e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 2.121828e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.866706e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.898945e-03 s Time to initialize coeftab 2.778761e-02 s Time to factorize 3.411971e+00 s ( 5.94 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o 
/ 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 1.034684e+00 s Time for refinement 4.537810e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.054447e-07 max(|| b_i - A x_i ||_1) 9.153211e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.309628e+00 (SUCCESS) Start 1484: shm_example_simple_lap_c_facto0_sched4_kwayprojections_tqrcpend 1354/3626 Test #1485: shm_example_simple_lap_c_facto0_sched4_not_rqrrtbegin ...................***Timeout 210.24 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.764880e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L 
structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.520656e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.235124e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 1.139444e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.017040e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.258488e-03 s Time to initialize coeftab 1.122804e+00 s Time to factorize 7.795862e+00 s ( 2.60 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 6.278892e-01 s - iteration 1 : total iteration time 0.793 s error 5.0171e-11 Time for refinement 1.313429e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822262e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.570730e-08 max(|| b_i - A x_i ||_1) 3.302265e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.332602e-01 (SUCCESS) Start 1485: shm_example_simple_lap_c_facto0_sched4_not_rqrrtbegin 1354/3626 Test #1487: shm_example_simple_lap_c_facto0_sched4_kway_rqrrtbegin ..................***Timeout 210.29 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.501962e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.292864e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.641973e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 4.110975e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.341290e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.205769e-02 s Time to initialize coeftab 
6.042200e-01 s Time to factorize 1.130758e+01 s ( 1.79 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 7.642133e-01 s - iteration 1 : total iteration time 0.525 s error 1.9359e-11 Time for refinement 1.217861e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822262e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.590592e-08 max(|| b_i - A x_i ||_1) 3.298565e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.323265e-01 (SUCCESS) Start 1487: shm_example_simple_lap_c_facto0_sched4_kway_rqrrtbegin 1354/3626 Test #1489: shm_example_simple_lap_c_facto0_sched4_kwayprojections_rqrrtbegin .......***Timeout 210.30 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections 
width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.857836e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.078870e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.462365e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 7.307908e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.199031e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.389315e-01 s Time to initialize coeftab 1.314149e+00 s Time to factorize 1.050632e+01 s ( 1.93 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 6.822266e-01 s Time for refinement 4.107221e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.592052e-07 max(|| b_i - A x_i ||_1) 1.406633e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.549355e+00 (SUCCESS) Start 1489: shm_example_simple_lap_c_facto0_sched4_kwayprojections_rqrrtbegin Test 
#1049: shm_example_simple_lap_c_facto3_sched1_kwayprojections_svdbegin .........***Timeout 210.30 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.043122e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.806939e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.863688e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 2.956430e-01 s 
+-------------------------------------------------+ Analyze task: Total time for analyze 6.432420e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 3.880053e-03 s Time to initialize coeftab 7.317526e-02 s Time to factorize 1.664630e+01 s ( 1.22 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 8.778514e-01 s Time for refinement 4.385530e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.173328e-07 max(|| b_i - A x_i ||_1) 9.823897e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.478863e+00 (SUCCESS) Start 1049: shm_example_simple_lap_c_facto3_sched1_kwayprojections_svdbegin 1354/3626 Test #1490: shm_example_simple_lap_c_facto0_sched4_kwayprojections_rqrrtend .........***Timeout 210.32 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 
1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.604011e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.328336e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.599293e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 6.190224e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.740933e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.291327e-01 s Time to initialize coeftab 5.552594e-02 s Time to factorize 4.315724e+00 s ( 4.70 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 9.090058e-01 s Time for refinement 5.629013e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || 
b_i ||_2) 2.030166e-07 max(|| b_i - A x_i ||_1) 9.028802e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.278236e+00 (SUCCESS) Start 1490: shm_example_simple_lap_c_facto0_sched4_kwayprojections_rqrrtend 1354/3626 Test #1491: shm_example_simple_lap_c_facto0_sched4_kway_pqrcpilu0 ...................***Timeout 210.35 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.873988e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.477362e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.720708e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: 
Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 5.226856e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.824598e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.499097e-01 s Time to initialize coeftab 8.950437e-02 s Time to factorize 4.455718e+00 s ( 4.55 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 1.346133e+00 s Time for refinement 4.138870e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.725087e-07 max(|| b_i - A x_i ||_1) 9.893532e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.496434e+00 (SUCCESS) Start 1491: shm_example_simple_lap_c_facto0_sched4_kway_pqrcpilu0 1354/3626 Test #1492: shm_example_simple_lap_c_facto0_sched4_kway_pqrcpilu1 ...................***Timeout 210.42 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 
6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.004513e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.778597e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.227305e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 1.224283e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.344695e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.494144e-01 s Time to initialize coeftab 5.856220e-01 s Time to factorize 2.868778e+00 s ( 7.07 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 
Ko / 638 Ko Time to solve 5.562094e-01 s Time for refinement 2.683952e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.077245e-07 max(|| b_i - A x_i ||_1) 9.218282e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.326048e+00 (SUCCESS) Start 1492: shm_example_simple_lap_c_facto0_sched4_kway_pqrcpilu1 1354/3626 Test #1495: shm_example_simple_lap_c_facto1_sched4_kway_svdbegin ....................***Timeout 210.38 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 1495: shm_example_simple_lap_c_facto1_sched4_kway_svdbegin 1354/3626 Test #1497: shm_example_simple_lap_c_facto1_sched4_kwayprojections_svdbegin .........***Timeout 210.27 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 
Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 1497: shm_example_simple_lap_c_facto1_sched4_kwayprojections_svdbegin 1354/3626 Test #1500: shm_example_simple_lap_c_facto1_sched4_not_pqrcpend .....................***Timeout 210.08 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections 
distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.502634e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.252327e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.047095e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.691730e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.246949e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.108525e-01 s Time to initialize coeftab 4.825545e-02 s Time to factorize 3.496689e+00 s ( 6.09 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 9.282183e-01 s Time for refinement 3.449442e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.061487e-07 max(|| b_i - A x_i ||_1) 8.806483e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.222139e+00 (SUCCESS) Start 1500: 
shm_example_simple_lap_c_facto1_sched4_not_pqrcpend 1354/3626 Test #1501: shm_example_simple_lap_c_facto1_sched4_kway_pqrcpbegin ..................***Timeout 210.08 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.001658e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.031614e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.519596e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 
6.932254e-04 s Time for mapping/scheduling 6.767414e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.180438e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.141338e-01 s Time to initialize coeftab 4.296253e-01 s Time to factorize 6.234481e+00 s ( 3.42 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 7.053891e-01 s Time for refinement 3.957227e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.883397e-07 max(|| b_i - A x_i ||_1) 1.151207e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.904840e+00 (SUCCESS) Start 1501: shm_example_simple_lap_c_facto1_sched4_kway_pqrcpbegin 1354/3626 Test #1502: shm_example_simple_lap_c_facto1_sched4_kway_pqrcpend ....................***Timeout 210.08 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 
Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.693811e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.155888e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.677243e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.360322e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.403575e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 6.674494e-02 s Time to initialize coeftab 1.345218e-01 s Time to factorize 3.584634e+00 s ( 5.94 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 7.254228e-01 s Time for refinement 3.463526e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 
max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.077674e-07 max(|| b_i - A x_i ||_1) 8.832589e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.228726e+00 (SUCCESS) Start 1502: shm_example_simple_lap_c_facto1_sched4_kway_pqrcpend 1354/3626 Test #1503: shm_example_simple_lap_c_facto1_sched4_kwayprojections_pqrcpbegin .......***Timeout 210.14 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.789422e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.817156e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.567003e-02 s +-------------------------------------------------+ 
Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.705502e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.829541e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 5.123972e-03 s Time to initialize coeftab 4.239060e-01 s Time to factorize 9.829526e+00 s ( 2.17 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 9.048859e-01 s - iteration 1 : total iteration time 0.744 s error 5.1097e-11 Time for refinement 1.485484e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822262e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.688231e-08 max(|| b_i - A x_i ||_1) 3.358458e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.474393e-01 (SUCCESS) Start 1503: shm_example_simple_lap_c_facto1_sched4_kwayprojections_pqrcpbegin 1354/3626 Test #1506: shm_example_simple_lap_c_facto1_sched4_not_rqrcpend .....................***Timeout 210.26 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple 
Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.140470e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.757379e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.062655e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.972215e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.357947e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.072054e-01 s Time to initialize coeftab 1.023110e-01 s Time to factorize 4.742554e+00 s ( 4.49 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o 
Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 9.428758e-01 s Time for refinement 5.056859e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.070325e-07 max(|| b_i - A x_i ||_1) 8.784929e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.216700e+00 (SUCCESS) Start 1506: shm_example_simple_lap_c_facto1_sched4_not_rqrcpend 1354/3626 Test #1510: shm_example_simple_lap_c_facto1_sched4_kwayprojections_rqrcpend .........***Timeout 210.18 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.064999e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 
65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.719085e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.676996e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.809297e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.026166e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 5.562502e-02 s Time to initialize coeftab 3.563862e-02 s Time to factorize 3.354104e+00 s ( 6.35 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 9.127983e-01 s Time for refinement 7.574159e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.077364e-07 max(|| b_i - A x_i ||_1) 8.737420e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.204712e+00 (SUCCESS) Start 1510: shm_example_simple_lap_c_facto1_sched4_kwayprojections_rqrcpend 1354/3626 Test #1511: shm_example_simple_lap_c_facto1_sched4_not_tqrcpbegin ...................***Timeout 210.17 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread 
static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.972798e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.708861e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.264130e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.354291e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.640110e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.335199e-01 s Time to initialize coeftab 1.675239e+00 s Time to factorize 1.009747e+01 s ( 2.11 MFlop/s) Number of operations 26.71 MFlops Number 
of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 1.122586e+00 s Time for refinement 1.652606e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.896428e-07 max(|| b_i - A x_i ||_1) 1.169109e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.950011e+00 (SUCCESS) Start 1511: shm_example_simple_lap_c_facto1_sched4_not_tqrcpbegin 1354/3626 Test #1512: shm_example_simple_lap_c_facto1_sched4_not_tqrcpend .....................***Timeout 210.00 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time 
to compute ordering 3.658365e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.802520e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.004275e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 8.776461e-02 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.816121e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.187848e-02 s Time to initialize coeftab 3.619257e-02 s Time to factorize 4.048350e+00 s ( 5.26 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 6.774841e-01 s Time for refinement 4.318373e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.080101e-07 max(|| b_i - A x_i ||_1) 8.964636e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.262045e+00 (SUCCESS) Start 1512: shm_example_simple_lap_c_facto1_sched4_not_tqrcpend 1354/3626 Test #1513: shm_example_simple_lap_c_facto1_sched4_kway_tqrcpbegin ..................***Timeout 209.17 sec ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.803218e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.258003e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.046050e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.525828e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.212458e+00 s +-------------------------------------------------+ Factorization task: 
Factorization used: LDL^t Time to initialize internal csc 2.074190e-01 s Time to initialize coeftab 1.450945e+00 s Time to factorize 1.191251e+01 s ( 1.79 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 5.249298e-01 s Time for refinement 1.616727e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.760712e-07 max(|| b_i - A x_i ||_1) 1.266916e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.196808e+00 (SUCCESS) Start 1513: shm_example_simple_lap_c_facto1_sched4_kway_tqrcpbegin 1354/3626 Test #1514: shm_example_simple_lap_c_facto1_sched4_kway_tqrcpend ....................***Timeout 208.74 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections 
depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.379412e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.293231e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.136686e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.859774e-02 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.590799e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 9.505816e-03 s Time to initialize coeftab 2.500436e-02 s Time to factorize 4.936385e+00 s ( 4.32 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 9.516401e-01 s Time for refinement 5.498152e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.051335e-07 max(|| b_i - A x_i ||_1) 8.754163e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.208937e+00 (SUCCESS) Start 1514: 
shm_example_simple_lap_c_facto1_sched4_kway_tqrcpend 1354/3626 Test #1515: shm_example_simple_lap_c_facto1_sched4_kwayprojections_tqrcpbegin .......***Timeout 208.62 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.600959e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.495399e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.387508e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to 
factorize 6.932254e-04 s Time for mapping/scheduling 1.202413e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.865065e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 7.772907e-03 s Time to initialize coeftab 1.525399e+00 s Time to factorize 8.474914e+00 s ( 2.51 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 7.359678e-01 s - iteration 1 : total iteration time 0.488 s error 4.9294e-11 Time for refinement 1.039643e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822262e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.464942e-08 max(|| b_i - A x_i ||_1) 3.251811e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.205292e-01 (SUCCESS) Start 1515: shm_example_simple_lap_c_facto1_sched4_kwayprojections_tqrcpbegin 1354/3626 Test #1516: shm_example_simple_lap_c_facto1_sched4_kwayprojections_tqrcpend .........***Timeout 208.43 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: 
Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.891968e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.289574e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.887907e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.865660e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.801340e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.724173e-01 s Time to initialize coeftab 1.092543e-01 s Time to factorize 4.129158e+00 s ( 5.16 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 6.841438e-01 s Time for 
refinement 4.684566e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.084613e-07 max(|| b_i - A x_i ||_1) 8.851157e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.233412e+00 (SUCCESS) Start 1516: shm_example_simple_lap_c_facto1_sched4_kwayprojections_tqrcpend 1354/3626 Test #1517: shm_example_simple_lap_c_facto1_sched4_not_rqrrtbegin ...................***Timeout 208.33 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.365457e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.005383e-01 s +-------------------------------------------------+ Reordering subtask: Split level 
0 Stoping criterion -1 Time for reordering 9.869077e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.089189e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.268616e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 8.869245e-02 s Time to initialize coeftab 1.221233e+00 s Time to factorize 1.516270e+01 s ( 1.41 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 5.454198e-01 s - iteration 1 : total iteration time 0.432 s error 4.9817e-11 Time for refinement 8.614392e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822262e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.628810e-08 max(|| b_i - A x_i ||_1) 3.337029e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.420321e-01 (SUCCESS) Start 1517: shm_example_simple_lap_c_facto1_sched4_not_rqrrtbegin 1354/3626 Test #1520: shm_example_simple_lap_c_facto1_sched4_kway_rqrrtend ....................***Timeout 208.17 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number 
of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.767495e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.726090e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.494910e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 4.154822e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.611419e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.082011e-01 s Time to initialize coeftab 6.468256e-02 s Time to factorize 3.502584e+00 s ( 6.08 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 
Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 8.845849e-01 s Time for refinement 5.779001e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.042851e-07 max(|| b_i - A x_i ||_1) 8.776254e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.214512e+00 (SUCCESS) Start 1520: shm_example_simple_lap_c_facto1_sched4_kway_rqrrtend 1354/3626 Test #1522: shm_example_simple_lap_c_facto1_sched4_kwayprojections_rqrrtend .........***Timeout 207.81 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.474202e+00 s +-------------------------------------------------+ Symbolic 
factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.581536e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.273408e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.119048e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.845547e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.267837e-01 s Time to initialize coeftab 2.345452e-01 s Time to factorize 3.255826e+00 s ( 6.54 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 6.669568e-01 s Time for refinement 1.969502e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.062632e-07 max(|| b_i - A x_i ||_1) 8.821896e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.226028e+00 (SUCCESS) Start 1522: shm_example_simple_lap_c_facto1_sched4_kwayprojections_rqrrtend 1354/3626 Test #1523: shm_example_simple_lap_c_facto1_sched4_kway_pqrcpilu0 ...................***Timeout 207.50 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.953990e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.200734e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.789998e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.988386e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.303106e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.492441e-01 s Time to initialize coeftab 
9.221181e-02 s Time to factorize 2.980234e+00 s ( 7.15 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 8.264084e-01 s Time for refinement 6.297706e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.112943e-07 max(|| b_i - A x_i ||_1) 1.161912e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.931850e+00 (SUCCESS) Start 1523: shm_example_simple_lap_c_facto1_sched4_kway_pqrcpilu0 1354/3626 Test #1524: shm_example_simple_lap_c_facto1_sched4_kway_pqrcpilu1 ...................***Timeout 207.38 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 
+-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.566108e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.313455e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.521498e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 9.892521e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.339183e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.367605e-01 s Time to initialize coeftab 2.920504e-01 s Time to factorize 3.981274e+00 s ( 5.35 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 8.674283e-01 s Time for refinement 3.308296e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.079259e-07 max(|| b_i - A x_i ||_1) 8.835228e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.229392e+00 (SUCCESS) Start 1524: shm_example_simple_lap_c_facto1_sched4_kway_pqrcpilu1 1354/3626 Test #1526: shm_example_simple_lap_c_facto2_sched4_not_svdend 
.......................***Timeout 206.96 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.512510e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.755876e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.159347e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 8.474708e-02 s +-------------------------------------------------+ Analyze task: Total time 
for analyze 6.054353e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 9.670532e-03 s Time to initialize coeftab 4.771670e-02 s Time to factorize 8.110244e+00 s ( 4.93 MFlop/s) Number of operations 8.15 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 4.029615e-01 s Time for refinement 2.130685e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.974942e-07 max(|| b_i - A x_i ||_1) 8.436051e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.128668e+00 (SUCCESS) Start 1526: shm_example_simple_lap_c_facto2_sched4_not_svdend 1354/3626 Test #1528: shm_example_simple_lap_c_facto2_sched4_kway_svdend ......................***Timeout 206.76 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY 
Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.054933e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.474442e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.820778e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 1.041320e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.376735e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.188987e-02 s Time to initialize coeftab 2.013616e-02 s Time to factorize 8.618870e+00 s ( 4.64 MFlop/s) Number of operations 8.15 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 4.651920e-01 s Time for refinement 1.550908e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.998192e-07 max(|| b_i - A x_i ||_1) 8.516330e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 
2.148925e+00 (SUCCESS) Start 1528: shm_example_simple_lap_c_facto2_sched4_kway_svdend Start 1858: c_mpi_rep_example_step-by-step_lap_z_facto2 Start 1859: c_mpi_rep_example_step-by-step_lap_z_facto3 Start 1860: c_mpi_rep_example_step-by-step_lap_z_facto4 Start 1861: c_mpi_rep_example_personal_lap_s_facto0 Start 1862: c_mpi_rep_example_personal_lap_s_facto1 Start 1863: c_mpi_rep_example_personal_lap_s_facto2 Start 1864: c_mpi_rep_example_personal_lap_d_facto0 Start 1865: c_mpi_rep_example_personal_lap_d_facto1 1354/3626 Test #1530: shm_example_simple_lap_c_facto2_sched4_kwayprojections_svdend ........... Passed 203.61 sec 1355/3626 Test #1538: shm_example_simple_lap_c_facto2_sched4_not_rqrcpend ..................... Passed 194.74 sec 1356/3626 Test #1532: shm_example_simple_lap_c_facto2_sched4_not_pqrcpend ..................... Passed 203.04 sec 1357/3626 Test #1550: shm_example_simple_lap_c_facto2_sched4_not_rqrrtend ..................... Passed 175.76 sec 1358/3626 Test #1540: shm_example_simple_lap_c_facto2_sched4_kway_rqrcpend .................... Passed 190.44 sec Test #1068: shm_example_simple_lap_c_facto3_sched1_kwayprojections_tqrcpend ......... Passed 184.59 sec Test #1092: shm_example_simple_lap_c_facto4_sched1_kway_rqrcpend .................... Passed 166.06 sec 1361/3626 Test #1534: shm_example_simple_lap_c_facto2_sched4_kway_pqrcpend .................... Passed 202.51 sec 1362/3626 Test #1555: shm_example_simple_lap_c_facto2_sched4_kway_pqrcpilu0 ................... Passed 174.29 sec 1363/3626 Test #1536: shm_example_simple_lap_c_facto2_sched4_kwayprojections_pqrcpend ......... Passed 196.56 sec 1364/3626 Test #1535: shm_example_simple_lap_c_facto2_sched4_kwayprojections_pqrcpbegin ....... Passed 198.32 sec 1365/3626 Test #1572: shm_example_simple_lap_c_facto3_sched4_kway_rqrcpend .................... Passed 167.39 sec 1366/3626 Test #1562: shm_example_simple_lap_c_facto3_sched4_kwayprojections_svdend ........... 
Passed 171.80 sec Test #1100: shm_example_simple_lap_c_facto4_sched1_kwayprojections_tqrcpend ......... Passed 161.58 sec 1368/3626 Test #1556: shm_example_simple_lap_c_facto2_sched4_kway_pqrcpilu1 ................... Passed 174.28 sec 1369/3626 Test #1574: shm_example_simple_lap_c_facto3_sched4_kwayprojections_rqrcpend ......... Passed 166.83 sec 1370/3626 Test #1568: shm_example_simple_lap_c_facto3_sched4_kwayprojections_pqrcpend ......... Passed 170.58 sec 1371/3626 Test #1531: shm_example_simple_lap_c_facto2_sched4_not_pqrcpbegin ................... Passed 203.24 sec 1372/3626 Test #1529: shm_example_simple_lap_c_facto2_sched4_kwayprojections_svdbegin ......... Passed 203.96 sec Test #1039: shm_example_simple_lap_c_facto2_sched1_kway_rqrrtbegin ..................***Timeout 213.32 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 1039: shm_example_simple_lap_c_facto2_sched1_kway_rqrrtbegin Test #1041: 
shm_example_simple_lap_c_facto2_sched1_kwayprojections_rqrrtbegin .......***Timeout 213.34 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 1041: shm_example_simple_lap_c_facto2_sched1_kwayprojections_rqrrtbegin Test #1045: shm_example_simple_lap_c_facto3_sched1_not_svdbegin .....................***Timeout 213.21 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low 
rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 1045: shm_example_simple_lap_c_facto3_sched1_not_svdbegin Test #1047: shm_example_simple_lap_c_facto3_sched1_kway_svdbegin ....................***Timeout 213.21 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 1047: shm_example_simple_lap_c_facto3_sched1_kway_svdbegin 1373/3626 Test #1450: shm_example_simple_lap_d_facto2_sched4_kway_tqrcpend ....................***Timeout 213.19 sec Start 1450: shm_example_simple_lap_d_facto2_sched4_kway_tqrcpend 1373/3626 Test #1461: 
shm_example_simple_lap_c_facto0_sched4_not_svdbegin .....................***Timeout 213.05 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 1461: shm_example_simple_lap_c_facto0_sched4_not_svdbegin 1373/3626 Test #1465: shm_example_simple_lap_c_facto0_sched4_kwayprojections_svdbegin .........***Timeout 213.01 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: 
Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 1465: shm_example_simple_lap_c_facto0_sched4_kwayprojections_svdbegin 1373/3626 Test #1525: shm_example_simple_lap_c_facto2_sched4_not_svdbegin .....................***Timeout 208.20 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 1525: shm_example_simple_lap_c_facto2_sched4_not_svdbegin 1373/3626 Test #1527: shm_example_simple_lap_c_facto2_sched4_kway_svdbegin ....................***Timeout 207.75 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + 
PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 1527: shm_example_simple_lap_c_facto2_sched4_kway_svdbegin Start 1866: c_mpi_rep_example_personal_lap_d_facto2 Start 1867: c_mpi_rep_example_personal_lap_c_facto0 Start 1868: c_mpi_rep_example_personal_lap_c_facto1 Start 1869: c_mpi_rep_example_personal_lap_c_facto2 Start 1870: c_mpi_rep_example_personal_lap_c_facto3 Start 1871: c_mpi_rep_example_personal_lap_c_facto4 Start 1872: c_mpi_rep_example_personal_lap_z_facto0 Start 1873: c_mpi_rep_example_personal_lap_z_facto1 Start 1874: c_mpi_rep_example_personal_lap_z_facto2 Start 1875: c_mpi_rep_example_personal_lap_z_facto3 Start 1876: c_mpi_rep_example_personal_lap_z_facto4 Start 1877: c_mpi_rep_example_simple_scotch_rsa Start 1878: c_mpi_rep_example_simple_scotch_mm Start 1879: c_mpi_rep_example_simple_scotch_hb Start 1880: c_mpi_rep_example_simple_scotch_mm2 Start 1881: c_mpi_rep_example_simple_single_rsa Start 1882: c_mpi_rep_example_simple_single_mm Start 1883: c_mpi_rep_example_simple_single_hb Start 1884: c_mpi_rep_example_simple_single_mm2 1373/3626 Test #1557: 
shm_example_simple_lap_c_facto3_sched4_not_svdbegin ..................... Passed 175.42 sec 1374/3626 Test #1569: shm_example_simple_lap_c_facto3_sched4_not_rqrcpbegin ................... Passed 171.77 sec Test #1095: shm_example_simple_lap_c_facto4_sched1_not_tqrcpbegin ................... Passed 165.90 sec 1376/3626 Test #1561: shm_example_simple_lap_c_facto3_sched4_kwayprojections_svdbegin ......... Passed 173.34 sec 1377/3626 Test #1563: shm_example_simple_lap_c_facto3_sched4_not_pqrcpbegin ................... Passed 172.88 sec 1378/3626 Test #1537: shm_example_simple_lap_c_facto2_sched4_not_rqrcpbegin ................... Passed 197.58 sec 1379/3626 Test #1613: shm_example_simple_lap_c_facto4_sched4_not_rqrrtbegin ................... Passed 149.48 sec 1380/3626 Test #1611: shm_example_simple_lap_c_facto4_sched4_kwayprojections_tqrcpbegin ....... Passed 149.71 sec 1381/3626 Test #1608: shm_example_simple_lap_c_facto4_sched4_not_tqrcpend ..................... Passed 150.85 sec 1382/3626 Test #1588: shm_example_simple_lap_c_facto3_sched4_kway_pqrcpilu1 ................... Passed 159.65 sec 1383/3626 Test #1609: shm_example_simple_lap_c_facto4_sched4_kway_tqrcpbegin .................. Passed 149.99 sec 1384/3626 Test #1631: shm_example_simple_lap_z_facto0_sched4_kwayprojections_pqrcpbegin ....... Passed 143.16 sec Test #1138: shm_example_simple_lap_z_facto0_sched1_kwayprojections_rqrrtend ......... Passed 141.01 sec 1386/3626 Test #1598: shm_example_simple_lap_c_facto4_sched4_kway_pqrcpend .................... Passed 154.77 sec 1387/3626 Test #1610: shm_example_simple_lap_c_facto4_sched4_kway_tqrcpend .................... Passed 149.85 sec 1388/3626 Test #1624: shm_example_simple_lap_z_facto0_sched4_kway_svdend ...................... Passed 145.08 sec 1389/3626 Test #1539: shm_example_simple_lap_c_facto2_sched4_kway_rqrcpbegin .................. 
Passed 194.62 sec 1390/3626 Test #1533: shm_example_simple_lap_c_facto2_sched4_kway_pqrcpbegin ..................***Timeout 204.43 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 1533: shm_example_simple_lap_c_facto2_sched4_kway_pqrcpbegin 1390/3626 Test #1548: shm_example_simple_lap_c_facto2_sched4_kwayprojections_tqrcpend ......... Passed 177.83 sec 1391/3626 Test #1581: shm_example_simple_lap_c_facto3_sched4_not_rqrrtbegin ................... 
Passed 164.01 sec Start 1885: c_mpi_rep_example_step-by-step_single_rsa Start 1886: c_mpi_rep_example_step-by-step_single_mm Start 1887: c_mpi_rep_example_step-by-step_single_hb Start 1888: c_mpi_rep_example_step-by-step_single_mm2 Start 1889: c_mpi_rep_example_simple_refine_cg Start 1890: c_mpi_rep_example_simple_refine_gmres Start 1891: c_mpi_rep_example_simple_refine_bicgstab Start 1892: c_mpi_rep_example_refinement_lap_s_refine_cg_sym Start 1893: c_mpi_rep_example_refinement_lap_s_refine_gmres_sym Start 1894: c_mpi_rep_example_refinement_lap_s_refine_bicgstab_sym Start 1895: c_mpi_rep_example_refinement_lap_d_refine_cg_sym Start 1896: c_mpi_rep_example_refinement_lap_d_refine_gmres_sym Start 1897: c_mpi_rep_example_refinement_lap_d_refine_bicgstab_sym Start 1898: c_mpi_rep_example_refinement_lap_c_refine_cg_her Start 1899: c_mpi_rep_example_refinement_lap_c_refine_gmres_her Start 1900: c_mpi_rep_example_refinement_lap_c_refine_bicgstab_her Start 1901: c_mpi_rep_example_refinement_lap_c_refine_cg_sym Start 1902: c_mpi_rep_example_refinement_lap_c_refine_gmres_sym Start 1903: c_mpi_rep_example_refinement_lap_c_refine_bicgstab_sym Test #1121: shm_example_simple_lap_z_facto0_sched1_not_rqrcpbegin ................... Passed 154.40 sec Start 1904: c_mpi_rep_example_refinement_lap_z_refine_cg_her 1393/3626 Test #1628: shm_example_simple_lap_z_facto0_sched4_not_pqrcpend ..................... Passed 145.00 sec Start 1905: c_mpi_rep_example_refinement_lap_z_refine_gmres_her 1394/3626 Test #1623: shm_example_simple_lap_z_facto0_sched4_kway_svdbegin .................... Passed 146.68 sec Start 1906: c_mpi_rep_example_refinement_lap_z_refine_bicgstab_her 1395/3626 Test #1546: shm_example_simple_lap_c_facto2_sched4_kway_tqrcpend .................... Passed 183.29 sec Start 1907: c_mpi_rep_example_refinement_lap_z_refine_cg_sym Test #1098: shm_example_simple_lap_c_facto4_sched1_kway_tqrcpend .................... 
Passed 164.32 sec 1397/3626 Test #1606: shm_example_simple_lap_c_facto4_sched4_kwayprojections_rqrcpend ......... Passed 153.89 sec Start 1908: c_mpi_rep_example_refinement_lap_z_refine_gmres_sym Start 1909: c_mpi_rep_example_refinement_lap_z_refine_bicgstab_sym 1398/3626 Test #1578: shm_example_simple_lap_c_facto3_sched4_kway_tqrcpend .................... Passed 165.88 sec 1399/3626 Test #1596: shm_example_simple_lap_c_facto4_sched4_not_pqrcpend ..................... Passed 156.68 sec Start 1910: c_mpi_rep_example_simple_mixed_refine_cg Start 1911: c_mpi_rep_example_simple_mixed_refine_gmres 1400/3626 Test #1618: shm_example_simple_lap_c_facto4_sched4_kwayprojections_rqrrtend ......... Passed 149.20 sec Start 1912: c_mpi_rep_example_simple_mixed_refine_bicgstab 1401/3626 Test #1551: shm_example_simple_lap_c_facto2_sched4_kway_rqrrtbegin .................. Passed 179.38 sec Start 1913: c_mpi_rep_example_simple_mixed_lap_d_refine_cg_sym 1402/3626 Test #1542: shm_example_simple_lap_c_facto2_sched4_kwayprojections_rqrcpend ......... Passed 191.39 sec 1403/3626 Test #1560: shm_example_simple_lap_c_facto3_sched4_kway_svdend ...................... Passed 176.41 sec Start 1914: c_mpi_rep_example_simple_mixed_lap_d_refine_gmres_sym Start 1915: c_mpi_rep_example_simple_mixed_lap_d_refine_bicgstab_sym 1404/3626 Test #1544: shm_example_simple_lap_c_facto2_sched4_not_tqrcpend ..................... Passed 186.16 sec Start 1916: c_mpi_rep_example_simple_mixed_lap_z_refine_cg_her 1405/3626 Test #1648: shm_example_simple_lap_z_facto0_sched4_kway_rqrrtend .................... Passed 130.35 sec Start 1917: c_mpi_rep_example_simple_mixed_lap_z_refine_gmres_her 1406/3626 Test #1615: shm_example_simple_lap_c_facto4_sched4_kway_rqrrtbegin .................. Passed 151.09 sec Start 1918: c_mpi_rep_example_simple_mixed_lap_z_refine_bicgstab_her 1407/3626 Test #1635: shm_example_simple_lap_z_facto0_sched4_kway_rqrcpbegin .................. 
Passed 139.62 sec Start 1919: c_mpi_rep_example_simple_mixed_lap_z_refine_cg_sym 1408/3626 Test #1629: shm_example_simple_lap_z_facto0_sched4_kway_pqrcpbegin .................. Passed 147.05 sec Start 1920: c_mpi_rep_example_simple_mixed_lap_z_refine_gmres_sym 1409/3626 Test #1622: shm_example_simple_lap_z_facto0_sched4_not_svdend ....................... Passed 149.08 sec Start 1921: c_mpi_rep_example_simple_mixed_lap_z_refine_bicgstab_sym 1410/3626 Test #1552: shm_example_simple_lap_c_facto2_sched4_kway_rqrrtend .................... Passed 180.62 sec Start 1922: mpi_rep_example_simple_lap_s_facto0_sched0_1d 1411/3626 Test #1592: shm_example_simple_lap_c_facto4_sched4_kway_svdend ...................... Passed 160.52 sec Start 1923: mpi_rep_example_simple_lap_s_facto1_sched0_1d Test #1101: shm_example_simple_lap_c_facto4_sched1_not_rqrrtbegin ................... Passed 165.29 sec Start 1924: mpi_rep_example_simple_lap_s_facto2_sched0_1d 1413/3626 Test #1612: shm_example_simple_lap_c_facto4_sched4_kwayprojections_tqrcpend ......... Passed 153.94 sec Start 1925: mpi_rep_example_simple_lap_d_facto0_sched0_1d 1414/3626 Test #1573: shm_example_simple_lap_c_facto3_sched4_kwayprojections_rqrcpbegin ....... Passed 172.94 sec Start 1926: mpi_rep_example_simple_lap_d_facto1_sched0_1d Test #1118: shm_example_simple_lap_z_facto0_sched1_kway_pqrcpend .................... Passed 160.42 sec Start 1927: mpi_rep_example_simple_lap_d_facto2_sched0_1d 1416/3626 Test #1547: shm_example_simple_lap_c_facto2_sched4_kwayprojections_tqrcpbegin ....... Passed 186.00 sec Start 1928: mpi_rep_example_simple_lap_c_facto0_sched0_1d Test #1103: shm_example_simple_lap_c_facto4_sched1_kway_rqrrtbegin .................. Passed 165.51 sec Start 1929: mpi_rep_example_simple_lap_c_facto1_sched0_1d 1418/3626 Test #1654: shm_example_simple_lap_z_facto1_sched4_not_svdend ....................... 
Passed 129.76 sec Start 1930: mpi_rep_example_simple_lap_c_facto2_sched0_1d 1419/3626 Test #1617: shm_example_simple_lap_c_facto4_sched4_kwayprojections_rqrrtbegin ....... Passed 153.62 sec Start 1931: mpi_rep_example_simple_lap_c_facto3_sched0_1d 1420/3626 Test #1565: shm_example_simple_lap_c_facto3_sched4_kway_pqrcpbegin .................. Passed 178.53 sec Start 1932: mpi_rep_example_simple_lap_c_facto4_sched0_1d 1421/3626 Test #1637: shm_example_simple_lap_z_facto0_sched4_kwayprojections_rqrcpbegin ....... Passed 141.48 sec Start 1933: mpi_rep_example_simple_lap_z_facto0_sched0_1d 1422/3626 Test #1641: shm_example_simple_lap_z_facto0_sched4_kway_tqrcpbegin .................. Passed 140.05 sec Start 1934: mpi_rep_example_simple_lap_z_facto1_sched0_1d 1423/3626 Test #1541: shm_example_simple_lap_c_facto2_sched4_kwayprojections_rqrcpbegin .......***Timeout 200.15 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 1541: 
shm_example_simple_lap_c_facto2_sched4_kwayprojections_rqrcpbegin 1423/3626 Test #1583: shm_example_simple_lap_c_facto3_sched4_kway_rqrrtbegin .................. Passed 170.90 sec Start 1935: mpi_rep_example_simple_lap_z_facto2_sched0_1d 1424/3626 Test #1614: shm_example_simple_lap_c_facto4_sched4_not_rqrrtend ..................... Passed 160.05 sec Start 1936: mpi_rep_example_simple_lap_z_facto3_sched0_1d 1425/3626 Test #1627: shm_example_simple_lap_z_facto0_sched4_not_pqrcpbegin ................... Passed 156.44 sec Start 1937: mpi_rep_example_simple_lap_z_facto4_sched0_1d 1426/3626 Test #1645: shm_example_simple_lap_z_facto0_sched4_not_rqrrtbegin ................... Passed 142.74 sec Start 1938: mpi_rep_example_simple_lap_s_facto0_sched1_1d 1427/3626 Test #1625: shm_example_simple_lap_z_facto0_sched4_kwayprojections_svdbegin ......... Passed 159.13 sec Start 1939: mpi_rep_example_simple_lap_s_facto1_sched1_1d 1428/3626 Test #1567: shm_example_simple_lap_c_facto3_sched4_kwayprojections_pqrcpbegin ....... Passed 186.56 sec Start 1940: mpi_rep_example_simple_lap_s_facto2_sched1_1d 1429/3626 Test #1638: shm_example_simple_lap_z_facto0_sched4_kwayprojections_rqrcpend ......... Passed 148.90 sec Start 1941: mpi_rep_example_simple_lap_d_facto0_sched1_1d 1430/3626 Test #1543: shm_example_simple_lap_c_facto2_sched4_not_tqrcpbegin ...................***Timeout 200.15 sec Start 1543: shm_example_simple_lap_c_facto2_sched4_not_tqrcpbegin 1430/3626 Test #1595: shm_example_simple_lap_c_facto4_sched4_not_pqrcpbegin ................... Passed 170.30 sec Start 1942: mpi_rep_example_simple_lap_d_facto1_sched1_1d 1431/3626 Test #1644: shm_example_simple_lap_z_facto0_sched4_kwayprojections_tqrcpend ......... Passed 145.68 sec Start 1943: mpi_rep_example_simple_lap_d_facto2_sched1_1d 1432/3626 Test #1633: shm_example_simple_lap_z_facto0_sched4_not_rqrcpbegin ................... 
Passed 155.01 sec Start 1944: mpi_rep_example_simple_lap_c_facto0_sched1_1d 1433/3626 Test #1559: shm_example_simple_lap_c_facto3_sched4_kway_svdbegin .................... Passed 191.12 sec Start 1945: mpi_rep_example_simple_lap_c_facto1_sched1_1d 1434/3626 Test #1558: shm_example_simple_lap_c_facto3_sched4_not_svdend ....................... Passed 191.35 sec Start 1946: mpi_rep_example_simple_lap_c_facto2_sched1_1d 1435/3626 Test #1621: shm_example_simple_lap_z_facto0_sched4_not_svdbegin ..................... Passed 163.73 sec Start 1947: mpi_rep_example_simple_lap_c_facto3_sched1_1d 1436/3626 Test #1607: shm_example_simple_lap_c_facto4_sched4_not_tqrcpbegin ................... Passed 169.43 sec Start 1948: mpi_rep_example_simple_lap_c_facto4_sched1_1d 1437/3626 Test #1632: shm_example_simple_lap_z_facto0_sched4_kwayprojections_pqrcpend ......... Passed 160.11 sec Start 1949: mpi_rep_example_simple_lap_z_facto0_sched1_1d 1438/3626 Test #1590: shm_example_simple_lap_c_facto4_sched4_not_svdend ....................... 
Passed 177.68 sec Start 1950: mpi_rep_example_simple_lap_z_facto1_sched1_1d 1439/3626 Test #1545: shm_example_simple_lap_c_facto2_sched4_kway_tqrcpbegin ..................***Timeout 200.70 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 1545: shm_example_simple_lap_c_facto2_sched4_kway_tqrcpbegin Test #1079: shm_example_simple_lap_c_facto4_sched1_kway_svdbegin ....................***Timeout 201.05 sec Start 1079: shm_example_simple_lap_c_facto4_sched1_kway_svdbegin 1439/3626 Test #1579: shm_example_simple_lap_c_facto3_sched4_kwayprojections_tqrcpbegin ....... Passed 184.05 sec Start 1951: mpi_rep_example_simple_lap_z_facto2_sched1_1d 1440/3626 Test #1630: shm_example_simple_lap_z_facto0_sched4_kway_pqrcpend .................... Passed 163.93 sec Start 1952: mpi_rep_example_simple_lap_z_facto3_sched1_1d 1441/3626 Test #1616: shm_example_simple_lap_c_facto4_sched4_kway_rqrrtend .................... 
Passed 168.90 sec Start 1953: mpi_rep_example_simple_lap_z_facto4_sched1_1d 1442/3626 Test #1586: shm_example_simple_lap_c_facto3_sched4_kwayprojections_rqrrtend ......... Passed 181.96 sec Start 1954: mpi_rep_example_simple_lap_s_facto0_sched4_1d Test #1104: shm_example_simple_lap_c_facto4_sched1_kway_rqrrtend .................... Passed 182.65 sec Start 1955: mpi_rep_example_simple_lap_s_facto1_sched4_1d 1444/3626 Test #1650: shm_example_simple_lap_z_facto0_sched4_kwayprojections_rqrrtend ......... Passed 148.31 sec Start 1956: mpi_rep_example_simple_lap_s_facto2_sched4_1d 1445/3626 Test #1602: shm_example_simple_lap_c_facto4_sched4_not_rqrcpend ..................... Passed 176.66 sec Start 1957: mpi_rep_example_simple_lap_d_facto0_sched4_1d 1446/3626 Test #1626: shm_example_simple_lap_z_facto0_sched4_kwayprojections_svdend ........... Passed 167.02 sec Start 1958: mpi_rep_example_simple_lap_d_facto1_sched4_1d 1447/3626 Test #1690: shm_example_simple_lap_z_facto2_sched4_kwayprojections_svdend ........... Passed 124.61 sec Start 1959: mpi_rep_example_simple_lap_d_facto2_sched4_1d 1448/3626 Test #1554: shm_example_simple_lap_c_facto2_sched4_kwayprojections_rqrrtend ......... Passed 198.43 sec Start 1960: mpi_rep_example_simple_lap_c_facto0_sched4_1d 1449/3626 Test #1549: shm_example_simple_lap_c_facto2_sched4_not_rqrrtbegin ...................***Timeout 200.15 sec Start 1549: shm_example_simple_lap_c_facto2_sched4_not_rqrrtbegin 1449/3626 Test #1564: shm_example_simple_lap_c_facto3_sched4_not_pqrcpend ..................... Passed 195.65 sec Start 1961: mpi_rep_example_simple_lap_c_facto1_sched4_1d 1450/3626 Test #1570: shm_example_simple_lap_c_facto3_sched4_not_rqrcpend ..................... Passed 194.35 sec Start 1962: mpi_rep_example_simple_lap_c_facto2_sched4_1d 1451/3626 Test #1576: shm_example_simple_lap_c_facto3_sched4_not_tqrcpend ..................... 
Passed 190.02 sec Start 1963: mpi_rep_example_simple_lap_c_facto3_sched4_1d 1452/3626 Test #1584: shm_example_simple_lap_c_facto3_sched4_kway_rqrrtend .................... Passed 184.60 sec Start 1964: mpi_rep_example_simple_lap_c_facto4_sched4_1d 1453/3626 Test #1639: shm_example_simple_lap_z_facto0_sched4_not_tqrcpbegin ................... Passed 157.90 sec Start 1965: mpi_rep_example_simple_lap_z_facto0_sched4_1d 1454/3626 Test #1553: shm_example_simple_lap_c_facto2_sched4_kwayprojections_rqrrtbegin .......***Timeout 200.26 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 1553: shm_example_simple_lap_c_facto2_sched4_kwayprojections_rqrrtbegin 1454/3626 Test #1593: shm_example_simple_lap_c_facto4_sched4_kwayprojections_svdbegin ......... Passed 180.11 sec Start 1966: mpi_rep_example_simple_lap_z_facto1_sched4_1d 1455/3626 Test #1575: shm_example_simple_lap_c_facto3_sched4_not_tqrcpbegin ................... 
Passed 191.60 sec Start 1967: mpi_rep_example_simple_lap_z_facto2_sched4_1d 1456/3626 Test #1694: shm_example_simple_lap_z_facto2_sched4_kway_pqrcpend .................... Passed 125.16 sec Start 1968: mpi_rep_example_simple_lap_z_facto3_sched4_1d 1457/3626 Test #1619: shm_example_simple_lap_c_facto4_sched4_kway_pqrcpilu0 ................... Passed 171.45 sec Start 1969: mpi_rep_example_simple_lap_z_facto4_sched4_1d 1458/3626 Test #1647: shm_example_simple_lap_z_facto0_sched4_kway_rqrrtbegin .................. Passed 152.71 sec Start 1970: mpi_dst_example_simple_lap_s_facto0_sched0_1d 1459/3626 Test #1587: shm_example_simple_lap_c_facto3_sched4_kway_pqrcpilu0 ................... Passed 184.82 sec Start 1971: mpi_dst_example_simple_lap_s_facto1_sched0_1d 1460/3626 Test #1683: shm_example_simple_lap_z_facto1_sched4_kway_pqrcpilu0 ................... Passed 128.01 sec Start 1972: mpi_dst_example_simple_lap_s_facto2_sched0_1d 1461/3626 Test #1692: shm_example_simple_lap_z_facto2_sched4_not_pqrcpend ..................... Passed 127.15 sec Start 1973: mpi_dst_example_simple_lap_d_facto0_sched0_1d 1462/3626 Test #1642: shm_example_simple_lap_z_facto0_sched4_kway_tqrcpend .................... Passed 157.84 sec Start 1974: mpi_dst_example_simple_lap_d_facto1_sched0_1d 1463/3626 Test #1679: shm_example_simple_lap_z_facto1_sched4_kway_rqrrtbegin .................. Passed 129.13 sec Start 1975: mpi_dst_example_simple_lap_d_facto2_sched0_1d 1464/3626 Test #1688: shm_example_simple_lap_z_facto2_sched4_kway_svdend ...................... Passed 128.33 sec Start 1976: mpi_dst_example_simple_lap_c_facto0_sched0_1d 1465/3626 Test #1605: shm_example_simple_lap_c_facto4_sched4_kwayprojections_rqrcpbegin ....... Passed 178.75 sec Start 1977: mpi_dst_example_simple_lap_c_facto1_sched0_1d 1466/3626 Test #1678: shm_example_simple_lap_z_facto1_sched4_not_rqrrtend ..................... 
Passed 129.86 sec Start 1978: mpi_dst_example_simple_lap_c_facto2_sched0_1d 1467/3626 Test #1620: shm_example_simple_lap_c_facto4_sched4_kway_pqrcpilu1 ................... Passed 173.06 sec Start 1979: mpi_dst_example_simple_lap_c_facto3_sched0_1d 1468/3626 Test #1655: shm_example_simple_lap_z_facto1_sched4_kway_svdbegin .................... Passed 146.60 sec Start 1980: mpi_dst_example_simple_lap_c_facto4_sched0_1d 1469/3626 Test #1566: shm_example_simple_lap_c_facto3_sched4_kway_pqrcpend ....................***Timeout 200.09 sec Start 1566: shm_example_simple_lap_c_facto3_sched4_kway_pqrcpend 1469/3626 Test #1604: shm_example_simple_lap_c_facto4_sched4_kway_rqrcpend .................... Passed 182.28 sec Start 1981: mpi_dst_example_simple_lap_z_facto0_sched0_1d 1470/3626 Test #1601: shm_example_simple_lap_c_facto4_sched4_not_rqrcpbegin ................... Passed 182.80 sec Start 1982: mpi_dst_example_simple_lap_z_facto1_sched0_1d 1471/3626 Test #1571: shm_example_simple_lap_c_facto3_sched4_kway_rqrcpbegin ..................***Timeout 200.07 sec Start 1571: shm_example_simple_lap_c_facto3_sched4_kway_rqrcpbegin Test #1122: shm_example_simple_lap_z_facto0_sched1_not_rqrcpend ..................... Passed 180.51 sec Start 1983: mpi_dst_example_simple_lap_z_facto2_sched0_1d 1472/3626 Test #1689: shm_example_simple_lap_z_facto2_sched4_kwayprojections_svdbegin ......... Passed 131.48 sec Start 1984: mpi_dst_example_simple_lap_z_facto3_sched0_1d 1473/3626 Test #1667: shm_example_simple_lap_z_facto1_sched4_kway_rqrcpbegin .................. Passed 139.09 sec Start 1985: mpi_dst_example_simple_lap_z_facto4_sched0_1d 1474/3626 Test #1653: shm_example_simple_lap_z_facto1_sched4_not_svdbegin ..................... Passed 154.70 sec Start 1986: mpi_dst_example_simple_lap_s_facto0_sched1_1d 1475/3626 Test #1662: shm_example_simple_lap_z_facto1_sched4_kway_pqrcpend .................... 
Passed 145.17 sec Start 1987: mpi_dst_example_simple_lap_s_facto1_sched1_1d 1476/3626 Test #1643: shm_example_simple_lap_z_facto0_sched4_kwayprojections_tqrcpbegin ....... Passed 161.56 sec Start 1988: mpi_dst_example_simple_lap_s_facto2_sched1_1d Test #850: shm_example_simple_lap_s_facto2_sched1_kwayprojections_rqrrtbegin ....... Passed 190.60 sec Start 1989: mpi_dst_example_simple_lap_d_facto0_sched1_1d 1478/3626 Test #1640: shm_example_simple_lap_z_facto0_sched4_not_tqrcpend ..................... Passed 165.95 sec Start 1990: mpi_dst_example_simple_lap_d_facto1_sched1_1d 1479/3626 Test #1677: shm_example_simple_lap_z_facto1_sched4_not_rqrrtbegin ................... Passed 137.14 sec Start 1991: mpi_dst_example_simple_lap_d_facto2_sched1_1d 1480/3626 Test #1708: shm_example_simple_lap_z_facto2_sched4_kwayprojections_tqrcpend ......... Passed 123.46 sec Start 1992: mpi_dst_example_simple_lap_c_facto0_sched1_1d 1481/3626 Test #1577: shm_example_simple_lap_c_facto3_sched4_kway_tqrcpbegin .................. Passed 199.69 sec Start 1993: mpi_dst_example_simple_lap_c_facto1_sched1_1d 1482/3626 Test #1699: shm_example_simple_lap_z_facto2_sched4_kway_rqrcpbegin .................. Passed 127.30 sec Start 1994: mpi_dst_example_simple_lap_c_facto2_sched1_1d 1483/3626 Test #1580: shm_example_simple_lap_c_facto3_sched4_kwayprojections_tqrcpend .........***Timeout 200.16 sec Start 1580: shm_example_simple_lap_c_facto3_sched4_kwayprojections_tqrcpend 1483/3626 Test #1597: shm_example_simple_lap_c_facto4_sched4_kway_pqrcpbegin .................. Passed 191.40 sec Start 1995: mpi_dst_example_simple_lap_c_facto3_sched1_1d 1484/3626 Test #1663: shm_example_simple_lap_z_facto1_sched4_kwayprojections_pqrcpbegin ....... Passed 149.86 sec Start 1996: mpi_dst_example_simple_lap_c_facto4_sched1_1d 1485/3626 Test #1673: shm_example_simple_lap_z_facto1_sched4_kway_tqrcpbegin .................. 
Passed 141.66 sec Start 1997: mpi_dst_example_simple_lap_z_facto0_sched1_1d 1486/3626 Test #1784: c_mpi_rep_example_analyze_lap_d_facto0 .................................. Passed 91.35 sec Start 1998: mpi_dst_example_simple_lap_z_facto1_sched1_1d 1487/3626 Test #1700: shm_example_simple_lap_z_facto2_sched4_kway_rqrcpend .................... Passed 128.69 sec Start 1999: mpi_dst_example_simple_lap_z_facto2_sched1_1d Test #1187: shm_example_simple_lap_z_facto2_sched1_kway_rqrcpbegin .................. Passed 152.17 sec Start 2000: mpi_dst_example_simple_lap_z_facto3_sched1_1d 1489/3626 Test #1582: shm_example_simple_lap_c_facto3_sched4_not_rqrrtend .....................***Timeout 200.08 sec Start 1582: shm_example_simple_lap_c_facto3_sched4_not_rqrrtend 1489/3626 Test #1696: shm_example_simple_lap_z_facto2_sched4_kwayprojections_pqrcpend ......... Passed 139.17 sec Start 2001: mpi_dst_example_simple_lap_z_facto4_sched1_1d Test #950: shm_example_simple_lap_c_facto0_sched1_not_svdend ....................... Passed 134.53 sec Start 2002: mpi_dst_example_simple_lap_s_facto0_sched4_1d 1491/3626 Test #1585: shm_example_simple_lap_c_facto3_sched4_kwayprojections_rqrrtbegin .......***Timeout 200.67 sec Start 1585: shm_example_simple_lap_c_facto3_sched4_kwayprojections_rqrrtbegin Test #1106: shm_example_simple_lap_c_facto4_sched1_kwayprojections_rqrrtend .........***Timeout 200.04 sec Start 1106: shm_example_simple_lap_c_facto4_sched1_kwayprojections_rqrrtend 1491/3626 Test #1705: shm_example_simple_lap_z_facto2_sched4_kway_tqrcpbegin .................. 
Passed 131.45 sec Start 2003: mpi_dst_example_simple_lap_s_facto1_sched4_1d 1492/3626 Test #1589: shm_example_simple_lap_c_facto4_sched4_not_svdbegin .....................***Timeout 200.11 sec Start 1589: shm_example_simple_lap_c_facto4_sched4_not_svdbegin 1492/3626 Test #1591: shm_example_simple_lap_c_facto4_sched4_kway_svdbegin ....................***Timeout 200.35 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 1591: shm_example_simple_lap_c_facto4_sched4_kway_svdbegin 1492/3626 Test #1801: c_mpi_rep_example_simple_lap_d_facto1 ................................... Passed 87.82 sec Start 2004: mpi_dst_example_simple_lap_s_facto2_sched4_1d 1493/3626 Test #1594: shm_example_simple_lap_c_facto4_sched4_kwayprojections_svdend ...........***Timeout 200.07 sec Start 1594: shm_example_simple_lap_c_facto4_sched4_kwayprojections_svdend 1493/3626 Test #1703: shm_example_simple_lap_z_facto2_sched4_not_tqrcpbegin ................... 
Passed 136.26 sec Start 2005: mpi_dst_example_simple_lap_d_facto0_sched4_1d 1494/3626 Test #1599: shm_example_simple_lap_c_facto4_sched4_kwayprojections_pqrcpbegin .......***Timeout 200.18 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 1599: shm_example_simple_lap_c_facto4_sched4_kwayprojections_pqrcpbegin 1494/3626 Test #1600: shm_example_simple_lap_c_facto4_sched4_kwayprojections_pqrcpend .........***Timeout 200.06 sec Start 1600: shm_example_simple_lap_c_facto4_sched4_kwayprojections_pqrcpend 1494/3626 Test #1603: shm_example_simple_lap_c_facto4_sched4_kway_rqrcpbegin ..................***Timeout 200.07 sec Start 1603: shm_example_simple_lap_c_facto4_sched4_kway_rqrcpbegin 1494/3626 Test #1711: shm_example_simple_lap_z_facto2_sched4_kway_rqrrtbegin .................. Passed 132.01 sec Start 2006: mpi_dst_example_simple_lap_d_facto1_sched4_1d Test #1126: shm_example_simple_lap_z_facto0_sched1_kwayprojections_rqrcpend ......... 
Passed 196.38 sec Start 2007: mpi_dst_example_simple_lap_d_facto2_sched4_1d 1496/3626 Test #1680: shm_example_simple_lap_z_facto1_sched4_kway_rqrrtend .................... Passed 151.17 sec Start 2008: mpi_dst_example_simple_lap_c_facto0_sched4_1d 1497/3626 Test #1651: shm_example_simple_lap_z_facto0_sched4_kway_pqrcpilu0 ................... Passed 174.64 sec Start 2009: mpi_dst_example_simple_lap_c_facto1_sched4_1d 1498/3626 Test #1704: shm_example_simple_lap_z_facto2_sched4_not_tqrcpend ..................... Passed 142.41 sec Start 2010: mpi_dst_example_simple_lap_c_facto2_sched4_1d Test #1223: shm_example_simple_lap_z_facto3_sched1_not_tqrcpbegin ................... Passed 136.40 sec Start 2011: mpi_dst_example_simple_lap_c_facto3_sched4_1d 1500/3626 Test #1693: shm_example_simple_lap_z_facto2_sched4_kway_pqrcpbegin .................. Passed 153.20 sec Start 2012: mpi_dst_example_simple_lap_c_facto4_sched4_1d Test #1195: shm_example_simple_lap_z_facto2_sched1_kwayprojections_tqrcpbegin ....... Passed 160.47 sec Start 2013: mpi_dst_example_simple_lap_z_facto0_sched4_1d 1502/3626 Test #1698: shm_example_simple_lap_z_facto2_sched4_not_rqrcpend ..................... Passed 146.95 sec Start 2014: mpi_dst_example_simple_lap_z_facto1_sched4_1d 1503/3626 Test #1809: c_mpi_rep_example_simple_lap_z_facto1 ................................... Passed 97.75 sec Start 2015: mpi_dst_example_simple_lap_z_facto2_sched4_1d 1504/3626 Test #1674: shm_example_simple_lap_z_facto1_sched4_kway_tqrcpend .................... Passed 158.57 sec Start 2016: mpi_dst_example_simple_lap_z_facto3_sched4_1d 1505/3626 Test #1664: shm_example_simple_lap_z_facto1_sched4_kwayprojections_pqrcpend ......... Passed 166.61 sec Start 2017: mpi_dst_example_simple_lap_z_facto4_sched4_1d 1506/3626 Test #1697: shm_example_simple_lap_z_facto2_sched4_not_rqrcpbegin ................... 
Passed 155.97 sec Start 2018: mpi_dst_example_simple_lap_s_facto0_sched0_not_svdbegin 1507/3626 Test #1691: shm_example_simple_lap_z_facto2_sched4_not_pqrcpbegin ................... Passed 158.93 sec Start 2019: mpi_dst_example_simple_lap_s_facto0_sched0_not_svdend Test #878: shm_example_simple_lap_d_facto0_sched1_not_rqrrtbegin ...................***Timeout 200.10 sec Start 2020: mpi_dst_example_simple_lap_s_facto0_sched0_kway_svdbegin 1509/3626 Test #1634: shm_example_simple_lap_z_facto0_sched4_not_rqrcpend .....................***Timeout 200.19 sec Start 1634: shm_example_simple_lap_z_facto0_sched4_not_rqrcpend 1509/3626 Test #1813: c_mpi_rep_example_simple_solve_and_refine_lap_s_facto0 .................. Passed 106.14 sec Start 2021: mpi_dst_example_simple_lap_s_facto0_sched0_kway_svdend 1510/3626 Test #1636: shm_example_simple_lap_z_facto0_sched4_kway_rqrcpend ....................***Timeout 200.18 sec Start 1636: shm_example_simple_lap_z_facto0_sched4_kway_rqrcpend 1510/3626 Test #1789: c_mpi_rep_example_analyze_lap_c_facto2 .................................. Passed 114.82 sec Start 2022: mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_svdbegin 1511/3626 Test #1788: c_mpi_rep_example_analyze_lap_c_facto1 .................................. Passed 116.95 sec Start 2023: mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_svdend 1512/3626 Test #1839: c_mpi_rep_example_simple_trans_lap_c_facto4 ............................. Passed 109.64 sec Start 2024: mpi_dst_example_simple_lap_s_facto0_sched0_not_pqrcpbegin 1513/3626 Test #1840: c_mpi_rep_example_simple_trans_lap_z_facto0 ............................. Passed 110.40 sec Start 2025: mpi_dst_example_simple_lap_s_facto0_sched0_not_pqrcpend 1514/3626 Test #1706: shm_example_simple_lap_z_facto2_sched4_kway_tqrcpend .................... 
Passed 161.75 sec Start 2026: mpi_dst_example_simple_lap_s_facto0_sched0_kway_pqrcpbegin 1515/3626 Test #1778: shm_example_simple_lap_z_facto4_sched4_kwayprojections_rqrrtend ......... Passed 127.36 sec Start 2027: mpi_dst_example_simple_lap_s_facto0_sched0_kway_pqrcpend 1516/3626 Test #1646: shm_example_simple_lap_z_facto0_sched4_not_rqrrtend .....................***Timeout 200.10 sec Start 1646: shm_example_simple_lap_z_facto0_sched4_not_rqrrtend 1516/3626 Test #1652: shm_example_simple_lap_z_facto0_sched4_kway_pqrcpilu1 ................... Passed 198.39 sec Start 2028: mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_pqrcpbegin 1517/3626 Test #1824: c_mpi_rep_example_simple_solve_and_refine_lap_z_facto0 .................. Passed 115.13 sec Start 2029: mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_pqrcpend 1518/3626 Test #1649: shm_example_simple_lap_z_facto0_sched4_kwayprojections_rqrrtbegin .......***Timeout 200.06 sec Start 1649: shm_example_simple_lap_z_facto0_sched4_kwayprojections_rqrrtbegin 1518/3626 Test #1754: shm_example_simple_lap_z_facto4_sched4_kwayprojections_svdend ........... Passed 140.76 sec Start 2030: mpi_dst_example_simple_lap_s_facto0_sched0_not_rqrcpbegin 1519/3626 Test #1695: shm_example_simple_lap_z_facto2_sched4_kwayprojections_pqrcpbegin ....... Passed 175.81 sec Start 2031: mpi_dst_example_simple_lap_s_facto0_sched0_not_rqrcpend 1520/3626 Test #1675: shm_example_simple_lap_z_facto1_sched4_kwayprojections_tqrcpbegin ....... Passed 179.30 sec Start 2032: mpi_dst_example_simple_lap_s_facto0_sched0_kway_rqrcpbegin 1521/3626 Test #1685: shm_example_simple_lap_z_facto2_sched4_not_svdbegin ..................... Passed 178.77 sec Start 2033: mpi_dst_example_simple_lap_s_facto0_sched0_kway_rqrcpend 1522/3626 Test #1793: c_mpi_rep_example_analyze_lap_z_facto1 .................................. Passed 131.63 sec 1523/3626 Test #1770: shm_example_simple_lap_z_facto4_sched4_kway_tqrcpend .................... 
Passed 144.69 sec 1524/3626 Test #1758: shm_example_simple_lap_z_facto4_sched4_kway_pqrcpend .................... Passed 149.48 sec Test #1163: shm_example_simple_lap_z_facto1_sched1_kwayprojections_tqrcpbegin .......***Timeout 207.01 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 1163: shm_example_simple_lap_z_facto1_sched1_kwayprojections_tqrcpbegin Test #1165: shm_example_simple_lap_z_facto1_sched1_not_rqrrtbegin ...................***Timeout 207.09 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution 
level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 1165: shm_example_simple_lap_z_facto1_sched1_not_rqrrtbegin 1525/3626 Test #1656: shm_example_simple_lap_z_facto1_sched4_kway_svdend ......................***Timeout 205.35 sec Start 1656: shm_example_simple_lap_z_facto1_sched4_kway_svdend Start 2034: mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_rqrcpbegin Start 2035: mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_rqrcpend Start 2036: mpi_dst_example_simple_lap_s_facto0_sched0_not_tqrcpbegin 1525/3626 Test #1798: c_mpi_rep_example_simple_lap_s_facto1 ................................... Passed 135.79 sec 1526/3626 Test #1837: c_mpi_rep_example_simple_trans_lap_c_facto2 ............................. Passed 129.67 sec Test #1235: shm_example_simple_lap_z_facto3_sched1_kway_pqrcpilu0 ................... Passed 162.12 sec 1528/3626 Test #1702: shm_example_simple_lap_z_facto2_sched4_kwayprojections_rqrcpend ......... Passed 180.87 sec Test #1283: shm_example_simple_lap_s_facto0_sched4_kway_rqrcpbegin .................. Passed 137.33 sec Test #1279: shm_example_simple_lap_s_facto0_sched4_kwayprojections_pqrcpbegin ....... Passed 138.57 sec 1531/3626 Test #1743: shm_example_simple_lap_z_facto3_sched4_kway_rqrrtbegin .................. Passed 157.56 sec Test #1248: shm_example_simple_lap_z_facto4_sched1_kwayprojections_pqrcpend ......... 
Passed 157.17 sec 1533/3626 Test #1763: shm_example_simple_lap_z_facto4_sched4_kway_rqrcpbegin .................. Passed 151.10 sec 1534/3626 Test #1717: shm_example_simple_lap_z_facto3_sched4_not_svdbegin ..................... Passed 167.60 sec 1535/3626 Test #1727: shm_example_simple_lap_z_facto3_sched4_kwayprojections_pqrcpbegin ....... Passed 166.48 sec 1536/3626 Test #1687: shm_example_simple_lap_z_facto2_sched4_kway_svdbegin .................... Passed 192.23 sec Test #1178: shm_example_simple_lap_z_facto2_sched1_kwayprojections_svdend ...........***Timeout 207.43 sec Start 1178: shm_example_simple_lap_z_facto2_sched1_kwayprojections_svdend 1537/3626 Test #1657: shm_example_simple_lap_z_facto1_sched4_kwayprojections_svdbegin .........***Timeout 206.59 sec Start 1657: shm_example_simple_lap_z_facto1_sched4_kwayprojections_svdbegin 1537/3626 Test #1658: shm_example_simple_lap_z_facto1_sched4_kwayprojections_svdend ...........***Timeout 206.06 sec Start 1658: shm_example_simple_lap_z_facto1_sched4_kwayprojections_svdend 1537/3626 Test #1659: shm_example_simple_lap_z_facto1_sched4_not_pqrcpbegin ...................***Timeout 206.09 sec Start 1659: shm_example_simple_lap_z_facto1_sched4_not_pqrcpbegin 1537/3626 Test #1660: shm_example_simple_lap_z_facto1_sched4_not_pqrcpend .....................***Timeout 205.87 sec Start 1660: shm_example_simple_lap_z_facto1_sched4_not_pqrcpend 1537/3626 Test #1661: shm_example_simple_lap_z_facto1_sched4_kway_pqrcpbegin ..................***Timeout 205.46 sec Start 1661: shm_example_simple_lap_z_facto1_sched4_kway_pqrcpbegin 1537/3626 Test #1686: shm_example_simple_lap_z_facto2_sched4_not_svdend ....................... Passed 193.09 sec Test #1232: shm_example_simple_lap_z_facto3_sched1_kway_rqrrtend .................... 
Passed 163.04 sec Start 2037: mpi_dst_example_simple_lap_s_facto0_sched0_not_tqrcpend Start 2038: mpi_dst_example_simple_lap_s_facto0_sched0_kway_tqrcpbegin Start 2039: mpi_dst_example_simple_lap_s_facto0_sched0_kway_tqrcpend Start 2040: mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_tqrcpbegin Start 2041: mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_tqrcpend Start 2042: mpi_dst_example_simple_lap_s_facto0_sched0_not_rqrrtbegin Start 2043: mpi_dst_example_simple_lap_s_facto0_sched0_not_rqrrtend Start 2044: mpi_dst_example_simple_lap_s_facto0_sched0_kway_rqrrtbegin Start 2045: mpi_dst_example_simple_lap_s_facto0_sched0_kway_rqrrtend Start 2046: mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_rqrrtbegin Start 2047: mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_rqrrtend Start 2048: mpi_dst_example_simple_lap_s_facto0_sched0_kway_pqrcpilu0 Start 2049: mpi_dst_example_simple_lap_s_facto0_sched0_kway_pqrcpilu1 Start 2050: mpi_dst_example_simple_lap_s_facto1_sched0_not_svdbegin 1539/3626 Test #1843: c_mpi_rep_example_simple_trans_lap_z_facto3 ............................. Passed 129.88 sec 1540/3626 Test #1764: shm_example_simple_lap_z_facto4_sched4_kway_rqrcpend .................... Passed 152.23 sec 1541/3626 Test #1728: shm_example_simple_lap_z_facto3_sched4_kwayprojections_pqrcpend ......... Passed 168.01 sec 1542/3626 Test #1828: c_mpi_rep_example_simple_solve_and_refine_lap_z_facto4 .................. Passed 131.95 sec 1543/3626 Test #1761: shm_example_simple_lap_z_facto4_sched4_not_rqrcpbegin ................... Passed 155.40 sec 1544/3626 Test #1729: shm_example_simple_lap_z_facto3_sched4_not_rqrcpbegin ................... Passed 167.72 sec Test #1242: shm_example_simple_lap_z_facto4_sched1_kwayprojections_svdend ........... Passed 161.31 sec 1546/3626 Test #1779: shm_example_simple_lap_z_facto4_sched4_kway_pqrcpilu0 ................... 
Passed 147.23 sec Test #1225: shm_example_simple_lap_z_facto3_sched1_kway_tqrcpbegin .................. Passed 170.74 sec 1548/3626 Test #1731: shm_example_simple_lap_z_facto3_sched4_kway_rqrcpbegin .................. Passed 164.80 sec 1549/3626 Test #1665: shm_example_simple_lap_z_facto1_sched4_not_rqrcpbegin ...................***Timeout 202.31 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 1665: shm_example_simple_lap_z_facto1_sched4_not_rqrcpbegin Start 2051: mpi_dst_example_simple_lap_s_facto1_sched0_not_svdend Start 2052: mpi_dst_example_simple_lap_s_facto1_sched0_kway_svdbegin Start 2053: mpi_dst_example_simple_lap_s_facto1_sched0_kway_svdend Start 2054: mpi_dst_example_simple_lap_s_facto1_sched0_kwayprojections_svdbegin Start 2055: mpi_dst_example_simple_lap_s_facto1_sched0_kwayprojections_svdend Start 2056: mpi_dst_example_simple_lap_s_facto1_sched0_not_pqrcpbegin Start 2057: mpi_dst_example_simple_lap_s_facto1_sched0_not_pqrcpend Start 
2058: mpi_dst_example_simple_lap_s_facto1_sched0_kway_pqrcpbegin Start 2059: mpi_dst_example_simple_lap_s_facto1_sched0_kway_pqrcpend Start 2060: mpi_dst_example_simple_lap_s_facto1_sched0_kwayprojections_pqrcpbegin 1549/3626 Test #1666: shm_example_simple_lap_z_facto1_sched4_not_rqrcpend .....................***Timeout 201.60 sec Start 1666: shm_example_simple_lap_z_facto1_sched4_not_rqrcpend 1549/3626 Test #1668: shm_example_simple_lap_z_facto1_sched4_kway_rqrcpend ....................***Timeout 201.49 sec Start 1668: shm_example_simple_lap_z_facto1_sched4_kway_rqrcpend 1549/3626 Test #1669: shm_example_simple_lap_z_facto1_sched4_kwayprojections_rqrcpbegin .......***Timeout 200.21 sec Start 1669: shm_example_simple_lap_z_facto1_sched4_kwayprojections_rqrcpbegin 1549/3626 Test #1787: c_mpi_rep_example_analyze_lap_c_facto0 .................................. Passed 143.85 sec Start 2061: mpi_dst_example_simple_lap_s_facto1_sched0_kwayprojections_pqrcpend 1550/3626 Test #1737: shm_example_simple_lap_z_facto3_sched4_kway_tqrcpbegin .................. Passed 162.23 sec Start 2062: mpi_dst_example_simple_lap_s_facto1_sched0_not_rqrcpbegin 1551/3626 Test #1841: c_mpi_rep_example_simple_trans_lap_z_facto1 ............................. 
Passed 132.40 sec Start 2063: mpi_dst_example_simple_lap_s_facto1_sched0_not_rqrcpend 1552/3626 Test #1670: shm_example_simple_lap_z_facto1_sched4_kwayprojections_rqrcpend .........***Timeout 200.63 sec Start 1670: shm_example_simple_lap_z_facto1_sched4_kwayprojections_rqrcpend 1552/3626 Test #1671: shm_example_simple_lap_z_facto1_sched4_not_tqrcpbegin ...................***Timeout 200.55 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.693084e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.915996e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.597720e-02 s 
+-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.087825e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.131148e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 6.347993e-02 s Time to initialize coeftab 1.790004e-01 s Time to factorize 1.409654e+00 s (15.12 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 497 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1.25 Mo / 1.25 Mo Time to solve 5.225974e-01 s - iteration 1 : total iteration time 0.576 s error 1.1989e-14 Time for refinement 1.225724e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.199697e-14 max(|| b_i - A x_i ||_1) 1.837348e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.636251e-02 (SUCCESS) Start 1671: shm_example_simple_lap_z_facto1_sched4_not_tqrcpbegin 1552/3626 Test #1791: c_mpi_rep_example_analyze_lap_c_facto4 .................................. Passed 141.33 sec Start 2064: mpi_dst_example_simple_lap_s_facto1_sched0_kway_rqrcpbegin Test #1328: shm_example_simple_lap_s_facto1_sched4_kway_rqrrtend .................... Passed 129.26 sec Start 2065: mpi_dst_example_simple_lap_s_facto1_sched0_kway_rqrcpend 1554/3626 Test #1713: shm_example_simple_lap_z_facto2_sched4_kwayprojections_rqrrtbegin ....... 
Passed 174.35 sec Start 2066: mpi_dst_example_simple_lap_s_facto1_sched0_kwayprojections_rqrcpbegin 1555/3626 Test #1709: shm_example_simple_lap_z_facto2_sched4_not_rqrrtbegin ................... Passed 184.69 sec Start 2067: mpi_dst_example_simple_lap_s_facto1_sched0_kwayprojections_rqrcpend Test #1246: shm_example_simple_lap_z_facto4_sched1_kway_pqrcpend .................... Passed 162.86 sec Start 2068: mpi_dst_example_simple_lap_s_facto1_sched0_not_tqrcpbegin 1557/3626 Test #1739: shm_example_simple_lap_z_facto3_sched4_kwayprojections_tqrcpbegin ....... Passed 163.67 sec Start 2069: mpi_dst_example_simple_lap_s_facto1_sched0_not_tqrcpend 1558/3626 Test #1806: c_mpi_rep_example_simple_lap_c_facto3 ................................... Passed 140.39 sec Start 2070: mpi_dst_example_simple_lap_s_facto1_sched0_kway_tqrcpbegin 1559/3626 Test #1672: shm_example_simple_lap_z_facto1_sched4_not_tqrcpend .....................***Timeout 202.50 sec Start 1672: shm_example_simple_lap_z_facto1_sched4_not_tqrcpend 1559/3626 Test #1676: shm_example_simple_lap_z_facto1_sched4_kwayprojections_tqrcpend .........***Timeout 200.31 sec Start 1676: shm_example_simple_lap_z_facto1_sched4_kwayprojections_tqrcpend 1559/3626 Test #1786: c_mpi_rep_example_analyze_lap_d_facto2 .................................. 
Passed 148.68 sec Start 2071: mpi_dst_example_simple_lap_s_facto1_sched0_kway_tqrcpend 1560/3626 Test #1681: shm_example_simple_lap_z_facto1_sched4_kwayprojections_rqrrtbegin .......***Timeout 200.36 sec Start 1681: shm_example_simple_lap_z_facto1_sched4_kwayprojections_rqrrtbegin 1560/3626 Test #1682: shm_example_simple_lap_z_facto1_sched4_kwayprojections_rqrrtend .........***Timeout 200.39 sec Start 1682: shm_example_simple_lap_z_facto1_sched4_kwayprojections_rqrrtend 1560/3626 Test #1684: shm_example_simple_lap_z_facto1_sched4_kway_pqrcpilu1 ...................***Timeout 201.08 sec Start 1684: shm_example_simple_lap_z_facto1_sched4_kway_pqrcpilu1 1560/3626 Test #1775: shm_example_simple_lap_z_facto4_sched4_kway_rqrrtbegin .................. Passed 154.56 sec Start 2072: mpi_dst_example_simple_lap_s_facto1_sched0_kwayprojections_tqrcpbegin 1561/3626 Test #1701: shm_example_simple_lap_z_facto2_sched4_kwayprojections_rqrcpbegin ....... Passed 191.00 sec Start 2073: mpi_dst_example_simple_lap_s_facto1_sched0_kwayprojections_tqrcpend 1562/3626 Test #1774: shm_example_simple_lap_z_facto4_sched4_not_rqrrtend ..................... Passed 156.87 sec Start 2074: mpi_dst_example_simple_lap_s_facto1_sched0_not_rqrrtbegin 1563/3626 Test #1814: c_mpi_rep_example_simple_solve_and_refine_lap_s_facto1 .................. Passed 142.94 sec Start 2075: mpi_dst_example_simple_lap_s_facto1_sched0_not_rqrrtend 1564/3626 Test #1781: c_mpi_rep_example_analyze_lap_s_facto0 .................................. Passed 157.39 sec Start 2076: mpi_dst_example_simple_lap_s_facto1_sched0_kway_rqrrtbegin 1565/3626 Test #1848: c_mpi_rep_example_step-by-step_lap_d_facto0 ............................. Passed 135.94 sec Start 2077: mpi_dst_example_simple_lap_s_facto1_sched0_kway_rqrrtend 1566/3626 Test #1748: shm_example_simple_lap_z_facto3_sched4_kway_pqrcpilu1 ................... 
Passed 171.30 sec Start 2078: mpi_dst_example_simple_lap_s_facto1_sched0_kwayprojections_rqrrtbegin Test #1203: shm_example_simple_lap_z_facto2_sched1_kway_pqrcpilu0 ...................***Timeout 200.15 sec Start 1203: shm_example_simple_lap_z_facto2_sched1_kway_pqrcpilu0 1567/3626 Test #1732: shm_example_simple_lap_z_facto3_sched4_kway_rqrcpend .................... Passed 177.23 sec Start 2079: mpi_dst_example_simple_lap_s_facto1_sched0_kwayprojections_rqrrtend 1568/3626 Test #1725: shm_example_simple_lap_z_facto3_sched4_kway_pqrcpbegin .................. Passed 181.95 sec Start 2080: mpi_dst_example_simple_lap_s_facto1_sched0_kway_pqrcpilu0 1569/3626 Test #1829: c_mpi_rep_example_simple_trans_lap_s_facto0 ............................. Passed 145.78 sec Start 2081: mpi_dst_example_simple_lap_s_facto1_sched0_kway_pqrcpilu1 Test #1206: shm_example_simple_lap_z_facto3_sched1_not_svdend ....................... Passed 199.19 sec Start 2082: mpi_dst_example_simple_lap_s_facto2_sched0_not_svdbegin 1571/3626 Test #1712: shm_example_simple_lap_z_facto2_sched4_kway_rqrrtend .................... Passed 192.41 sec Start 2083: mpi_dst_example_simple_lap_s_facto2_sched0_not_svdend 1572/3626 Test #1768: shm_example_simple_lap_z_facto4_sched4_not_tqrcpend ..................... Passed 168.40 sec Start 2084: mpi_dst_example_simple_lap_s_facto2_sched0_kway_svdbegin 1573/3626 Test #1851: c_mpi_rep_example_step-by-step_lap_c_facto0 ............................. Passed 137.20 sec Start 2085: mpi_dst_example_simple_lap_s_facto2_sched0_kway_svdend 1574/3626 Test #1719: shm_example_simple_lap_z_facto3_sched4_kway_svdbegin .................... Passed 186.22 sec Start 2086: mpi_dst_example_simple_lap_s_facto2_sched0_kwayprojections_svdbegin 1575/3626 Test #1749: shm_example_simple_lap_z_facto4_sched4_not_svdbegin ..................... 
Passed 175.87 sec Start 2087: mpi_dst_example_simple_lap_s_facto2_sched0_kwayprojections_svdend Test #1311: shm_example_simple_lap_s_facto1_sched4_kwayprojections_pqrcpbegin ....... Passed 144.88 sec Start 2088: mpi_dst_example_simple_lap_s_facto2_sched0_not_pqrcpbegin 1577/3626 Test #1777: shm_example_simple_lap_z_facto4_sched4_kwayprojections_rqrrtbegin ....... Passed 165.65 sec Start 2089: mpi_dst_example_simple_lap_s_facto2_sched0_not_pqrcpend 1578/3626 Test #1707: shm_example_simple_lap_z_facto2_sched4_kwayprojections_tqrcpbegin .......***Timeout 200.36 sec Start 1707: shm_example_simple_lap_z_facto2_sched4_kwayprojections_tqrcpbegin 1578/3626 Test #1710: shm_example_simple_lap_z_facto2_sched4_not_rqrrtend .....................***Timeout 200.17 sec Start 1710: shm_example_simple_lap_z_facto2_sched4_not_rqrrtend 1578/3626 Test #1715: shm_example_simple_lap_z_facto2_sched4_kway_pqrcpilu0 ................... Passed 191.91 sec Start 2090: mpi_dst_example_simple_lap_s_facto2_sched0_kway_pqrcpbegin Test #1320: shm_example_simple_lap_s_facto1_sched4_not_tqrcpend ..................... Passed 147.71 sec Start 2091: mpi_dst_example_simple_lap_s_facto2_sched0_kway_pqrcpend 1580/3626 Test #1767: shm_example_simple_lap_z_facto4_sched4_not_tqrcpbegin ................... Passed 176.09 sec Start 2092: mpi_dst_example_simple_lap_s_facto2_sched0_kwayprojections_pqrcpbegin 1581/3626 Test #1776: shm_example_simple_lap_z_facto4_sched4_kway_rqrrtend .................... Passed 172.79 sec Start 2093: mpi_dst_example_simple_lap_s_facto2_sched0_kwayprojections_pqrcpend Test #1341: shm_example_simple_lap_s_facto2_sched4_kway_pqrcpbegin .................. Passed 146.66 sec Start 2094: mpi_dst_example_simple_lap_s_facto2_sched0_not_rqrcpbegin 1583/3626 Test #1817: c_mpi_rep_example_simple_solve_and_refine_lap_d_facto1 .................. 
Passed 162.09 sec Start 2095: mpi_dst_example_simple_lap_s_facto2_sched0_not_rqrcpend 1584/3626 Test #1714: shm_example_simple_lap_z_facto2_sched4_kwayprojections_rqrrtend .........***Timeout 200.15 sec Start 1714: shm_example_simple_lap_z_facto2_sched4_kwayprojections_rqrrtend 1584/3626 Test #1716: shm_example_simple_lap_z_facto2_sched4_kway_pqrcpilu1 ...................***Timeout 217.46 sec Start 1716: shm_example_simple_lap_z_facto2_sched4_kway_pqrcpilu1 1584/3626 Test #1718: shm_example_simple_lap_z_facto3_sched4_not_svdend .......................***Timeout 216.87 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 1718: shm_example_simple_lap_z_facto3_sched4_not_svdend Test #1226: shm_example_simple_lap_z_facto3_sched1_kway_tqrcpend ....................***Timeout 216.89 sec Start 1226: shm_example_simple_lap_z_facto3_sched1_kway_tqrcpend 1584/3626 Test #1720: shm_example_simple_lap_z_facto3_sched4_kway_svdend ......................***Timeout 216.72 sec 
Start 1720: shm_example_simple_lap_z_facto3_sched4_kway_svdend 1584/3626 Test #1721: shm_example_simple_lap_z_facto3_sched4_kwayprojections_svdbegin .........***Timeout 216.72 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 1721: shm_example_simple_lap_z_facto3_sched4_kwayprojections_svdbegin 1584/3626 Test #1722: shm_example_simple_lap_z_facto3_sched4_kwayprojections_svdend ...........***Timeout 216.83 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 
Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 1722: shm_example_simple_lap_z_facto3_sched4_kwayprojections_svdend 1584/3626 Test #1723: shm_example_simple_lap_z_facto3_sched4_not_pqrcpbegin ...................***Timeout 216.90 sec Start 1723: shm_example_simple_lap_z_facto3_sched4_not_pqrcpbegin 1584/3626 Test #1724: shm_example_simple_lap_z_facto3_sched4_not_pqrcpend .....................***Timeout 216.90 sec Start 1724: shm_example_simple_lap_z_facto3_sched4_not_pqrcpend 1584/3626 Test #1771: shm_example_simple_lap_z_facto4_sched4_kwayprojections_tqrcpbegin ....... Passed 196.66 sec 1585/3626 Test #1794: c_mpi_rep_example_analyze_lap_z_facto2 .................................. Passed 186.65 sec 1586/3626 Test #1807: c_mpi_rep_example_simple_lap_c_facto4 ................................... Passed 184.89 sec 1587/3626 Test #1922: mpi_rep_example_simple_lap_s_facto0_sched0_1d ........................... Passed 136.15 sec 1588/3626 Test #1931: mpi_rep_example_simple_lap_c_facto3_sched0_1d ........................... 
Passed 134.11 sec 1589/3626 Test #1736: shm_example_simple_lap_z_facto3_sched4_not_tqrcpend .....................***Timeout 211.62 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.310200e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.734114e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.522958e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for 
mapping/scheduling 5.467944e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.337052e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 1.299897e-01 s Time to initialize coeftab 7.194583e-02 s Time to factorize 3.732038e+00 s ( 5.43 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 497 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1.25 Mo / 1.25 Mo Time to solve 2.162453e+00 s Time for refinement 2.639148e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.009357e-16 max(|| b_i - A x_i ||_1) 2.026441e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.113399e-03 (SUCCESS) Start 1736: shm_example_simple_lap_z_facto3_sched4_not_tqrcpend 1589/3626 Test #1762: shm_example_simple_lap_z_facto4_sched4_not_rqrcpend .....................***Timeout 220.54 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 
Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.835817e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.078722e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.742889e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.495251e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.205865e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 6.332098e-02 s Time to initialize coeftab 9.436479e-02 s Time to factorize 5.822342e-01 s (36.60 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 497 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1.25 Mo / 1.25 Mo Time to solve 3.367167e+00 s Time for refinement 1.968326e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i 
||_2 / || b_i ||_2) 4.766474e-16 max(|| b_i - A x_i ||_1) 1.855286e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.681515e-03 (SUCCESS) Start 1762: shm_example_simple_lap_z_facto4_sched4_not_rqrcpend Start 2096: mpi_dst_example_simple_lap_s_facto2_sched0_kway_rqrcpbegin Start 2097: mpi_dst_example_simple_lap_s_facto2_sched0_kway_rqrcpend Start 2098: mpi_dst_example_simple_lap_s_facto2_sched0_kwayprojections_rqrcpbegin Start 2099: mpi_dst_example_simple_lap_s_facto2_sched0_kwayprojections_rqrcpend Start 2100: mpi_dst_example_simple_lap_s_facto2_sched0_not_tqrcpbegin 1589/3626 Test #1795: c_mpi_rep_example_analyze_lap_z_facto3 .................................. Passed 207.45 sec Test #1339: shm_example_simple_lap_s_facto2_sched4_not_pqrcpbegin ................... Passed 191.60 sec Test #1365: shm_example_simple_lap_d_facto0_sched4_not_svdbegin ..................... Passed 184.01 sec 1592/3626 Test #1726: shm_example_simple_lap_z_facto3_sched4_kway_pqrcpend ....................***Timeout 239.26 sec Start 1726: shm_example_simple_lap_z_facto3_sched4_kway_pqrcpend 1592/3626 Test #1730: shm_example_simple_lap_z_facto3_sched4_not_rqrcpend .....................***Timeout 238.96 sec Start 1730: shm_example_simple_lap_z_facto3_sched4_not_rqrcpend 1592/3626 Test #1733: shm_example_simple_lap_z_facto3_sched4_kwayprojections_rqrcpbegin .......***Timeout 234.95 sec Start 1733: shm_example_simple_lap_z_facto3_sched4_kwayprojections_rqrcpbegin 1592/3626 Test #1734: shm_example_simple_lap_z_facto3_sched4_kwayprojections_rqrcpend .........***Timeout 234.24 sec Start 1734: shm_example_simple_lap_z_facto3_sched4_kwayprojections_rqrcpend Test #1240: shm_example_simple_lap_z_facto4_sched1_kway_svdend ......................***Timeout 233.46 sec Start 1240: shm_example_simple_lap_z_facto4_sched1_kway_svdend 1592/3626 Test #1735: shm_example_simple_lap_z_facto3_sched4_not_tqrcpbegin ...................***Timeout 232.41 sec Start 1735: 
shm_example_simple_lap_z_facto3_sched4_not_tqrcpbegin 1592/3626 Test #1738: shm_example_simple_lap_z_facto3_sched4_kway_tqrcpend ....................***Timeout 231.89 sec Start 1738: shm_example_simple_lap_z_facto3_sched4_kway_tqrcpend 1592/3626 Test #1740: shm_example_simple_lap_z_facto3_sched4_kwayprojections_tqrcpend .........***Timeout 231.66 sec Start 1740: shm_example_simple_lap_z_facto3_sched4_kwayprojections_tqrcpend 1592/3626 Test #1741: shm_example_simple_lap_z_facto3_sched4_not_rqrrtbegin ...................***Timeout 231.45 sec Start 1741: shm_example_simple_lap_z_facto3_sched4_not_rqrrtbegin Test #1243: shm_example_simple_lap_z_facto4_sched1_not_pqrcpbegin ...................***Timeout 231.46 sec Start 1243: shm_example_simple_lap_z_facto4_sched1_not_pqrcpbegin 1592/3626 Test #1742: shm_example_simple_lap_z_facto3_sched4_not_rqrrtend .....................***Timeout 231.49 sec Start 1742: shm_example_simple_lap_z_facto3_sched4_not_rqrrtend Test #1245: shm_example_simple_lap_z_facto4_sched1_kway_pqrcpbegin ..................***Timeout 231.29 sec Start 1245: shm_example_simple_lap_z_facto4_sched1_kway_pqrcpbegin 1592/3626 Test #1744: shm_example_simple_lap_z_facto3_sched4_kway_rqrrtend ....................***Timeout 231.25 sec Start 1744: shm_example_simple_lap_z_facto3_sched4_kway_rqrrtend 1592/3626 Test #1745: shm_example_simple_lap_z_facto3_sched4_kwayprojections_rqrrtbegin .......***Timeout 231.24 sec Start 1745: shm_example_simple_lap_z_facto3_sched4_kwayprojections_rqrrtbegin 1592/3626 Test #1746: shm_example_simple_lap_z_facto3_sched4_kwayprojections_rqrrtend .........***Timeout 230.79 sec Start 1746: shm_example_simple_lap_z_facto3_sched4_kwayprojections_rqrrtend 1592/3626 Test #1747: shm_example_simple_lap_z_facto3_sched4_kway_pqrcpilu0 ...................***Timeout 230.58 sec Start 1747: shm_example_simple_lap_z_facto3_sched4_kway_pqrcpilu0 1592/3626 Test #1750: shm_example_simple_lap_z_facto4_sched4_not_svdend .......................***Timeout 
230.40 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 1750: shm_example_simple_lap_z_facto4_sched4_not_svdend 1592/3626 Test #1751: shm_example_simple_lap_z_facto4_sched4_kway_svdbegin ....................***Timeout 230.22 sec Start 1751: shm_example_simple_lap_z_facto4_sched4_kway_svdbegin 1592/3626 Test #1752: shm_example_simple_lap_z_facto4_sched4_kway_svdend ......................***Timeout 230.27 sec Start 1752: shm_example_simple_lap_z_facto4_sched4_kway_svdend 1592/3626 Test #1753: shm_example_simple_lap_z_facto4_sched4_kwayprojections_svdbegin .........***Timeout 230.13 sec Start 1753: shm_example_simple_lap_z_facto4_sched4_kwayprojections_svdbegin 1592/3626 Test #1755: shm_example_simple_lap_z_facto4_sched4_not_pqrcpbegin ...................***Timeout 229.45 sec Start 1755: shm_example_simple_lap_z_facto4_sched4_not_pqrcpbegin 1592/3626 Test #1756: shm_example_simple_lap_z_facto4_sched4_not_pqrcpend .....................***Timeout 229.35 sec 
Start 1756: shm_example_simple_lap_z_facto4_sched4_not_pqrcpend 1592/3626 Test #1757: shm_example_simple_lap_z_facto4_sched4_kway_pqrcpbegin ..................***Timeout 228.84 sec Start 1757: shm_example_simple_lap_z_facto4_sched4_kway_pqrcpbegin 1592/3626 Test #1759: shm_example_simple_lap_z_facto4_sched4_kwayprojections_pqrcpbegin .......***Timeout 227.99 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 1759: shm_example_simple_lap_z_facto4_sched4_kwayprojections_pqrcpbegin 1592/3626 Test #1760: shm_example_simple_lap_z_facto4_sched4_kwayprojections_pqrcpend .........***Timeout 228.02 sec Start 1760: shm_example_simple_lap_z_facto4_sched4_kwayprojections_pqrcpend 1592/3626 Test #1765: shm_example_simple_lap_z_facto4_sched4_kwayprojections_rqrcpbegin .......***Timeout 224.60 sec Start 1765: shm_example_simple_lap_z_facto4_sched4_kwayprojections_rqrcpbegin 1592/3626 Test #1766: 
shm_example_simple_lap_z_facto4_sched4_kwayprojections_rqrcpend .........***Timeout 224.66 sec Start 1766: shm_example_simple_lap_z_facto4_sched4_kwayprojections_rqrcpend 1592/3626 Test #1803: c_mpi_rep_example_simple_lap_c_facto0 ...................................***Timeout 233.04 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.466516e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.060120e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.944811e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time 
for mapping/scheduling 1.813896e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.714939e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 5.808439e-02 s Time to initialize coeftab 4.745239e-01 s Time to factorize 2.978724e+00 s ( 6.81 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Memory usage of coeftab 137 Ko Time to solve 1.643121e+00 s Time for refinement 1.129224e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.882924e-07 max(|| b_i - A x_i ||_1) 8.500474e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.144924e+00 (SUCCESS) max(|| x_i ||_oo) 6.822264e-01 max(|| x0_i - x_i ||_oo) 3.625609e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 5.314379e-01 (SUCCESS) || A ||_1 5.112398e-02 || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.882924e-07 max(|| b_i - A x_i ||_1) 8.500474e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.144924e+00 (SUCCESS) max(|| x_i ||_oo) 6.822264e-01 max(|| x0_i - x_i ||_oo) 3.625609e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 5.314379e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.882924e-07 max(|| b_i - A x_i ||_1) 8.500474e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.144924e+00 (SUCCESS) || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| x_i ||_oo) 6.822264e-01 max(|| x0_i - x_i ||_oo) 3.625609e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 5.314379e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.882924e-07 max(|| b_i - A x_i ||_1) 8.500474e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.144924e+00 (SUCCESS) max(|| x_i ||_oo) 6.822264e-01 max(|| x0_i - x_i 
||_oo) 3.625609e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 5.314379e-01 (SUCCESS) Start 1803: c_mpi_rep_example_simple_lap_c_facto0 1592/3626 Test #1810: c_mpi_rep_example_simple_lap_z_facto2 ...................................***Timeout 240.56 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.122826e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.664139e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.240325e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.201965e+00 s 
+-------------------------------------------------+ Analyze task: Total time for analyze 1.051597e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.214165e-03 s Time to initialize coeftab 1.785430e+00 s Time to factorize 6.943903e+00 s ( 5.76 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Memory usage of coeftab 548 Ko Time to solve 3.478345e+00 s Time for refinement 2.497714e+00 s || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.025630e-16 max(|| b_i - A x_i ||_1) 1.595095e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.024966e-03 (SUCCESS) max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.025630e-16 max(|| b_i - A x_i ||_1) 1.595095e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.024966e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.025630e-16 max(|| b_i - A x_i ||_1) 1.595095e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.024966e-03 (SUCCESS) max(|| x0_i - x_i ||_oo) 1.217455e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 1.784532e-03 (SUCCESS) max(|| x_i ||_oo) 6.822263e-01 max(|| x0_i - x_i ||_oo) 1.217455e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 1.784532e-03 (SUCCESS) max(|| x_i ||_oo) 6.822263e-01 max(|| x0_i - x_i ||_oo) 1.217455e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 1.784532e-03 (SUCCESS) || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.025630e-16 max(|| b_i - A x_i ||_1) 1.595095e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.024966e-03 (SUCCESS) max(|| x_i ||_oo) 6.822263e-01 max(|| x0_i - x_i ||_oo) 1.217455e-15 max(|| x0_i - x_i ||_oo 
/ || x0_i ||_oo) 1.784532e-03 (SUCCESS) Start 1810: c_mpi_rep_example_simple_lap_z_facto2 1592/3626 Test #1823: c_mpi_rep_example_simple_solve_and_refine_lap_c_facto4 ..................***Timeout 249.93 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.255906e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.520537e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.003400e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.124879e+00 s 
+-------------------------------------------------+ Analyze task: Total time for analyze 1.047746e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 2.109374e-03 s Time to initialize coeftab 5.323146e-01 s Time to factorize 1.666781e+00 s (12.78 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Memory usage of coeftab 137 Ko Time to solve 4.403342e+00 s Time for refinement 3.244487e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.815038e-07 max(|| b_i - A x_i ||_1) 8.002532e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.019278e+00 (SUCCESS) max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112398e-02 max(|| x0_i - x_i ||_oo) 4.558879e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 6.682355e-01 (SUCCESS) max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.815038e-07 max(|| b_i - A x_i ||_1) 8.002532e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.019278e+00 (SUCCESS) max(|| x_i ||_oo) 6.822263e-01 max(|| x0_i - x_i ||_oo) 4.558879e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 6.682355e-01 (SUCCESS) || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.815038e-07 max(|| b_i - A x_i ||_1) 8.002532e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.019278e+00 (SUCCESS) max(|| x_i ||_oo) 6.822263e-01 max(|| x0_i - x_i ||_oo) 4.558879e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 6.682355e-01 (SUCCESS) || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.815038e-07 max(|| b_i - A x_i ||_1) 8.002532e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.019278e+00 (SUCCESS) max(|| x_i ||_oo) 6.822263e-01 max(|| x0_i - x_i ||_oo) 4.558879e-07 max(|| x0_i - x_i 
||_oo / || x0_i ||_oo) 6.682355e-01 (SUCCESS) Start 1823: c_mpi_rep_example_simple_solve_and_refine_lap_c_facto4 Test #1307: shm_example_simple_lap_s_facto1_sched4_not_pqrcpbegin ...................***Timeout 270.71 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.187883e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.262351e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.539788e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops 
Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.785490e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.674748e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 5.481710e-02 s Time to initialize coeftab 1.277984e-01 s Time to factorize 3.264533e+00 s ( 1.60 MFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 1.76 Ko Outside 2.11 Ko Low-rank supernodes Diag in diag 124 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 191 Ko / 191 Ko ------------------------------------------------ Total 319 Ko / 319 Ko Time to solve 2.376142e+00 s Time for refinement 6.113470e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.423685e-07 max(|| b_i - A x_i ||_1) 1.401444e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.761005e+00 (SUCCESS) Start 1307: shm_example_simple_lap_s_facto1_sched4_not_pqrcpbegin Test #1309: shm_example_simple_lap_s_facto1_sched4_kway_pqrcpbegin ..................***Timeout 270.62 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress 
method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.677991e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.993568e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.766287e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.057354e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.711222e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.446955e-02 s Time to initialize coeftab 1.507595e-01 s Time to factorize 1.477455e+00 s ( 3.54 MFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 1.76 Ko Outside 2.11 Ko Low-rank supernodes Diag in diag 124 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 191 Ko / 191 Ko ------------------------------------------------ Total 319 Ko / 319 Ko Time to solve 8.957955e-01 s Time for refinement 7.571646e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 
2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.371208e-07 max(|| b_i - A x_i ||_1) 1.374837e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.727572e+00 (SUCCESS) Start 1309: shm_example_simple_lap_s_facto1_sched4_kway_pqrcpbegin Test #1317: shm_example_simple_lap_s_facto1_sched4_kwayprojections_rqrcpbegin .......***Timeout 271.19 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.370399e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.993620e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.087471e-01 s 
+-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.345992e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.411129e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.754939e-02 s Time to initialize coeftab 8.046983e-01 s Time to factorize 1.837573e+00 s ( 2.85 MFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 1.76 Ko Outside 2.11 Ko Low-rank supernodes Diag in diag 124 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 191 Ko / 191 Ko ------------------------------------------------ Total 319 Ko / 319 Ko Time to solve 2.067060e+00 s - iteration 1 : total iteration time 0.94 s error 2.7867e-11 Time for refinement 2.043233e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.531046e-08 max(|| b_i - A x_i ||_1) 2.772703e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.484082e-01 (SUCCESS) Start 1317: shm_example_simple_lap_s_facto1_sched4_kwayprojections_rqrcpbegin Start 2101: mpi_dst_example_simple_lap_s_facto2_sched0_not_tqrcpend Start 2102: mpi_dst_example_simple_lap_s_facto2_sched0_kway_tqrcpbegin Start 2103: mpi_dst_example_simple_lap_s_facto2_sched0_kway_tqrcpend 1592/3626 Test #1879: c_mpi_rep_example_simple_scotch_hb ...................................... Passed 238.24 sec 1593/3626 Test #1883: c_mpi_rep_example_simple_single_hb ...................................... Passed 238.08 sec 1594/3626 Test #1937: mpi_rep_example_simple_lap_z_facto4_sched0_1d ........................... 
Passed 226.07 sec 1595/3626 Test #1865: c_mpi_rep_example_personal_lap_d_facto1 ................................. Passed 239.49 sec 1596/3626 Test #1920: c_mpi_rep_example_simple_mixed_lap_z_refine_gmres_sym ................... Passed 234.81 sec Test #1358: shm_example_simple_lap_s_facto2_sched4_not_rqrrtend ..................... Passed 267.21 sec Test #1391: shm_example_simple_lap_d_facto0_sched4_kway_rqrrtbegin .................. Passed 246.05 sec Test #1439: shm_example_simple_lap_d_facto2_sched4_kwayprojections_pqrcpbegin ....... Passed 243.46 sec 1600/3626 Test #1950: mpi_rep_example_simple_lap_z_facto1_sched1_1d ........................... Passed 219.66 sec 1601/3626 Test #1863: c_mpi_rep_example_personal_lap_s_facto2 ................................. Passed 239.60 sec 1602/3626 Test #1891: c_mpi_rep_example_simple_refine_bicgstab ................................ Passed 237.78 sec 1603/3626 Test #1933: mpi_rep_example_simple_lap_z_facto0_sched0_1d ........................... Passed 231.13 sec Test #1485: shm_example_simple_lap_c_facto0_sched4_not_rqrrtbegin ................... Passed 241.53 sec 1605/3626 Test #1934: mpi_rep_example_simple_lap_z_facto1_sched0_1d ........................... Passed 230.62 sec Test #1457: shm_example_simple_lap_d_facto2_sched4_kwayprojections_rqrrtbegin ....... Passed 242.27 sec Test #1373: shm_example_simple_lap_d_facto0_sched4_kway_pqrcpbegin .................. Passed 246.71 sec Test #1401: shm_example_simple_lap_d_facto1_sched4_kwayprojections_svdbegin ......... Passed 245.04 sec Test #1346: shm_example_simple_lap_s_facto2_sched4_not_rqrcpend ..................... Passed 267.30 sec 1610/3626 Test #1882: c_mpi_rep_example_simple_single_mm ...................................... Passed 238.22 sec Test #1392: shm_example_simple_lap_d_facto0_sched4_kway_rqrrtend .................... Passed 245.98 sec Test #1418: shm_example_simple_lap_d_facto1_sched4_kway_tqrcpend .................... 
Passed 244.47 sec Test #1419: shm_example_simple_lap_d_facto1_sched4_kwayprojections_tqrcpbegin ....... Passed 244.44 sec Test #1472: shm_example_simple_lap_c_facto0_sched4_kwayprojections_pqrcpend ......... Passed 241.96 sec Test #1395: shm_example_simple_lap_d_facto0_sched4_kway_pqrcpilu0 ................... Passed 245.75 sec Test #1424: shm_example_simple_lap_d_facto1_sched4_kway_rqrrtend .................... Passed 244.17 sec Test #1448: shm_example_simple_lap_d_facto2_sched4_not_tqrcpend ..................... Passed 243.43 sec 1618/3626 Test #1769: shm_example_simple_lap_z_facto4_sched4_kway_tqrcpbegin ..................***Timeout 298.61 sec Start 1769: shm_example_simple_lap_z_facto4_sched4_kway_tqrcpbegin 1618/3626 Test #1772: shm_example_simple_lap_z_facto4_sched4_kwayprojections_tqrcpend .........***Timeout 295.40 sec Start 1772: shm_example_simple_lap_z_facto4_sched4_kwayprojections_tqrcpend 1618/3626 Test #1773: shm_example_simple_lap_z_facto4_sched4_not_rqrrtbegin ...................***Timeout 295.44 sec Start 1773: shm_example_simple_lap_z_facto4_sched4_not_rqrrtbegin 1618/3626 Test #1780: shm_example_simple_lap_z_facto4_sched4_kway_pqrcpilu1 ...................***Timeout 294.59 sec Start 1780: shm_example_simple_lap_z_facto4_sched4_kway_pqrcpilu1 1618/3626 Test #1782: c_mpi_rep_example_analyze_lap_s_facto1 ..................................***Timeout 293.47 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 1782: c_mpi_rep_example_analyze_lap_s_facto1 1618/3626 Test #1783: c_mpi_rep_example_analyze_lap_s_facto2 ..................................***Timeout 293.45 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 1783: c_mpi_rep_example_analyze_lap_s_facto2 1618/3626 Test #1785: c_mpi_rep_example_analyze_lap_d_facto1 
..................................***Timeout 291.36 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Start 1785: c_mpi_rep_example_analyze_lap_d_facto1 1618/3626 Test #1790: c_mpi_rep_example_analyze_lap_c_facto3 ..................................***Timeout 287.15 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 1790: c_mpi_rep_example_analyze_lap_c_facto3 1618/3626 Test #1792: c_mpi_rep_example_analyze_lap_z_facto0 ..................................***Timeout 287.57 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - 
CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Start 1792: c_mpi_rep_example_analyze_lap_z_facto0 1618/3626 Test #1796: c_mpi_rep_example_analyze_lap_z_facto4 ..................................***Timeout 286.64 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Start 1796: c_mpi_rep_example_analyze_lap_z_facto4 1618/3626 Test #1797: c_mpi_rep_example_simple_lap_s_facto0 ...................................***Timeout 286.51 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: 
The thread number has been automatically set to 256 Start 1797: c_mpi_rep_example_simple_lap_s_facto0 1618/3626 Test #1799: c_mpi_rep_example_simple_lap_s_facto2 ...................................***Timeout 285.80 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Start 1799: c_mpi_rep_example_simple_lap_s_facto2 Test #1289: shm_example_simple_lap_s_facto0_sched4_kway_tqrcpbegin ..................***Timeout 285.78 sec Start 1289: shm_example_simple_lap_s_facto0_sched4_kway_tqrcpbegin 1618/3626 Test #1800: c_mpi_rep_example_simple_lap_d_facto0 ...................................***Timeout 285.76 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 1800: c_mpi_rep_example_simple_lap_d_facto0 1618/3626 Test #1802: c_mpi_rep_example_simple_lap_d_facto2 ...................................***Timeout 285.64 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ 
Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Start 1802: c_mpi_rep_example_simple_lap_d_facto2 1618/3626 Test #1804: c_mpi_rep_example_simple_lap_c_facto1 ...................................***Timeout 285.68 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 1804: c_mpi_rep_example_simple_lap_c_facto1 1618/3626 Test #1805: c_mpi_rep_example_simple_lap_c_facto2 ...................................***Timeout 285.76 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Start 1805: c_mpi_rep_example_simple_lap_c_facto2 1618/3626 Test #1808: c_mpi_rep_example_simple_lap_z_facto0 ...................................***Timeout 285.60 sec ischedInit: The thread number has been 
automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 1808: c_mpi_rep_example_simple_lap_z_facto0 1618/3626 Test #1811: c_mpi_rep_example_simple_lap_z_facto3 ...................................***Timeout 284.06 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 1811: c_mpi_rep_example_simple_lap_z_facto3 1618/3626 Test #1812: c_mpi_rep_example_simple_lap_z_facto4 ...................................***Timeout 283.09 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 1812: c_mpi_rep_example_simple_lap_z_facto4 1618/3626 Test #1815: c_mpi_rep_example_simple_solve_and_refine_lap_s_facto2 ..................***Timeout 282.91 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 1815: c_mpi_rep_example_simple_solve_and_refine_lap_s_facto2 1618/3626 Test #1816: c_mpi_rep_example_simple_solve_and_refine_lap_d_facto0 ..................***Timeout 283.06 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been 
automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 1816: c_mpi_rep_example_simple_solve_and_refine_lap_d_facto0 1618/3626 Test #1818: c_mpi_rep_example_simple_solve_and_refine_lap_d_facto2 ..................***Timeout 283.12 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Start 1818: c_mpi_rep_example_simple_solve_and_refine_lap_d_facto2 1618/3626 Test #1819: c_mpi_rep_example_simple_solve_and_refine_lap_c_facto0 ..................***Timeout 283.13 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Start 1819: c_mpi_rep_example_simple_solve_and_refine_lap_c_facto0 
1618/3626 Test #1820: c_mpi_rep_example_simple_solve_and_refine_lap_c_facto1 ..................***Timeout 283.17 sec Start 1820: c_mpi_rep_example_simple_solve_and_refine_lap_c_facto1 1618/3626 Test #1821: c_mpi_rep_example_simple_solve_and_refine_lap_c_facto2 ..................***Timeout 283.18 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 1821: c_mpi_rep_example_simple_solve_and_refine_lap_c_facto2 1618/3626 Test #1822: c_mpi_rep_example_simple_solve_and_refine_lap_c_facto3 ..................***Timeout 283.20 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Start 1822: c_mpi_rep_example_simple_solve_and_refine_lap_c_facto3 1618/3626 Test #1825: c_mpi_rep_example_simple_solve_and_refine_lap_z_facto1 ..................***Timeout 282.73 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 1825: c_mpi_rep_example_simple_solve_and_refine_lap_z_facto1 1618/3626 Test #1826: c_mpi_rep_example_simple_solve_and_refine_lap_z_facto2 ..................***Timeout 282.04 sec Start 1826: 
c_mpi_rep_example_simple_solve_and_refine_lap_z_facto2 1618/3626 Test #1827: c_mpi_rep_example_simple_solve_and_refine_lap_z_facto3 ..................***Timeout 282.14 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Start 1827: c_mpi_rep_example_simple_solve_and_refine_lap_z_facto3 1618/3626 Test #1830: c_mpi_rep_example_simple_trans_lap_s_facto1 .............................***Timeout 282.15 sec Start 1830: c_mpi_rep_example_simple_trans_lap_s_facto1 1618/3626 Test #1831: c_mpi_rep_example_simple_trans_lap_s_facto2 .............................***Timeout 282.19 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 
GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Start 1831: c_mpi_rep_example_simple_trans_lap_s_facto2 1618/3626 Test #1832: c_mpi_rep_example_simple_trans_lap_d_facto0 .............................***Timeout 282.34 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.320855e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.975620e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.444045e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 5.275677e+00 s 
+-------------------------------------------------+ Analyze task: Total time for analyze 1.981962e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.682988e-02 s Time to initialize coeftab 2.885999e-01 s Time to factorize 1.368261e+00 s ( 3.70 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Memory usage of coeftab 137 Ko Time to solve 3.649463e+00 s Start 1832: c_mpi_rep_example_simple_trans_lap_d_facto0 1618/3626 Test #1833: c_mpi_rep_example_simple_trans_lap_d_facto1 .............................***Timeout 282.43 sec Start 1833: c_mpi_rep_example_simple_trans_lap_d_facto1 1618/3626 Test #1834: c_mpi_rep_example_simple_trans_lap_d_facto2 .............................***Timeout 282.44 sec Start 1834: c_mpi_rep_example_simple_trans_lap_d_facto2 1618/3626 Test #1835: c_mpi_rep_example_simple_trans_lap_c_facto0 .............................***Timeout 282.49 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Start 1835: c_mpi_rep_example_simple_trans_lap_c_facto0 1618/3626 Test #1836: c_mpi_rep_example_simple_trans_lap_c_facto1 .............................***Timeout 282.54 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been 
automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Start 1836: c_mpi_rep_example_simple_trans_lap_c_facto1 1618/3626 Test #1838: c_mpi_rep_example_simple_trans_lap_c_facto3 .............................***Timeout 281.87 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Start 1838: c_mpi_rep_example_simple_trans_lap_c_facto3 1618/3626 Test #1842: c_mpi_rep_example_simple_trans_lap_z_facto2 .............................***Timeout 281.60 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 1842: c_mpi_rep_example_simple_trans_lap_z_facto2 1618/3626 Test #1844: c_mpi_rep_example_simple_trans_lap_z_facto4 .............................***Timeout 280.35 sec ischedInit: The thread number has been 
automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 1844: c_mpi_rep_example_simple_trans_lap_z_facto4 Test #1310: shm_example_simple_lap_s_facto1_sched4_kway_pqrcpend ....................***Timeout 278.34 sec Start 1310: shm_example_simple_lap_s_facto1_sched4_kway_pqrcpend Test #1313: shm_example_simple_lap_s_facto1_sched4_not_rqrcpbegin ...................***Timeout 278.32 sec Start 1313: shm_example_simple_lap_s_facto1_sched4_not_rqrcpbegin Test #1316: shm_example_simple_lap_s_facto1_sched4_kway_rqrcpend ....................***Timeout 278.32 sec Start 1316: shm_example_simple_lap_s_facto1_sched4_kway_rqrcpend Test #1323: shm_example_simple_lap_s_facto1_sched4_kwayprojections_tqrcpbegin .......***Timeout 278.15 sec Start 1323: shm_example_simple_lap_s_facto1_sched4_kwayprojections_tqrcpbegin Test #1325: shm_example_simple_lap_s_facto1_sched4_not_rqrrtbegin ...................***Timeout 278.14 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 
Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 1325: shm_example_simple_lap_s_facto1_sched4_not_rqrrtbegin Test #1327: shm_example_simple_lap_s_facto1_sched4_kway_rqrrtbegin ..................***Timeout 278.01 sec Start 1327: shm_example_simple_lap_s_facto1_sched4_kway_rqrrtbegin Test #1329: shm_example_simple_lap_s_facto1_sched4_kwayprojections_rqrrtbegin .......***Timeout 277.97 sec Start 1329: shm_example_simple_lap_s_facto1_sched4_kwayprojections_rqrrtbegin Test #1330: shm_example_simple_lap_s_facto1_sched4_kwayprojections_rqrrtend .........***Timeout 277.95 sec Start 1330: shm_example_simple_lap_s_facto1_sched4_kwayprojections_rqrrtend Test #1331: shm_example_simple_lap_s_facto1_sched4_kway_pqrcpilu0 ...................***Timeout 277.94 sec Start 1331: shm_example_simple_lap_s_facto1_sched4_kway_pqrcpilu0 Test #1326: shm_example_simple_lap_s_facto1_sched4_not_rqrrtend .....................***Timeout 278.13 sec Start 1326: shm_example_simple_lap_s_facto1_sched4_not_rqrrtend Test #1332: shm_example_simple_lap_s_facto1_sched4_kway_pqrcpilu1 ...................***Timeout 277.91 sec Start 1332: shm_example_simple_lap_s_facto1_sched4_kway_pqrcpilu1 Test #1335: shm_example_simple_lap_s_facto2_sched4_kway_svdbegin ....................***Timeout 275.12 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal 
Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 1335: shm_example_simple_lap_s_facto2_sched4_kway_svdbegin Test #1349: shm_example_simple_lap_s_facto2_sched4_kwayprojections_rqrcpbegin .......***Timeout 274.68 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.205468e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol 
matrix 2.326963e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.291419e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 7.136945e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.503450e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.169375e-01 s Time to initialize coeftab 2.523509e-01 s Time to factorize 3.691685e+00 s ( 2.70 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 124 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 514 Ko / 514 Ko Time to solve 1.416037e+00 s Time for refinement 9.383184e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.610671e-07 max(|| b_i - A x_i ||_1) 1.287419e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.617725e+00 (SUCCESS) Start 1349: shm_example_simple_lap_s_facto2_sched4_kwayprojections_rqrcpbegin Test #1336: shm_example_simple_lap_s_facto2_sched4_kway_svdend ......................***Timeout 274.25 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: 
Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.910163e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.287911e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.019288e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 4.435248e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.971964e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.410882e-01 s Time to initialize coeftab 1.505080e-01 s Time to factorize 1.465154e+00 s ( 6.81 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ 
Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 124 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 514 Ko / 514 Ko Time to solve 2.853124e+00 s Time for refinement 7.959786e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.974793e-07 max(|| b_i - A x_i ||_1) 8.503255e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.068489e+00 (SUCCESS) Start 1336: shm_example_simple_lap_s_facto2_sched4_kway_svdend Test #1340: shm_example_simple_lap_s_facto2_sched4_not_pqrcpend .....................***Timeout 274.39 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.047241e+01 s +-------------------------------------------------+ Symbolic 
factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.050894e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.645618e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 1.957330e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.086199e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.092720e-01 s Time to initialize coeftab 1.317552e-01 s Time to factorize 2.914408e+00 s ( 3.43 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 124 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 514 Ko / 514 Ko Time to solve 2.618639e+00 s Time for refinement 4.148754e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.905630e-07 max(|| b_i - A x_i ||_1) 8.213767e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.032113e+00 (SUCCESS) Start 1340: shm_example_simple_lap_s_facto2_sched4_not_pqrcpend 1618/3626 Test #1855: c_mpi_rep_example_step-by-step_lap_c_facto4 .............................***Timeout 258.35 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : 
Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.553037e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.842420e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.111462e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 8.151356e-02 s Time to initialize internal csc 5.343891e-03 s Time to initialize coeftab 6.044930e-02 s Time to factorize 8.163651e-01 s (26.10 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Memory usage of coeftab 137 Ko Time to solve 1.296053e+00 s WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement 
works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Time for refinement 9.916344e+00 s || A ||_1 5.112398e-02 || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.764572e-02 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i ||_oo) 2.764572e-02 max(|| x_i ||_oo) 7.029080e-01 || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.764572e-02 max(|| x_i ||_oo) 7.029080e-01 || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.764572e-02 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.834695e-07 max(|| b_i - A x_i ||_1) 8.161138e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.062945e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.834695e-07 max(|| b_i - A x_i ||_1) 8.161138e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.062945e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.834695e-07 max(|| b_i - A x_i ||_1) 8.161138e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.062945e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.834695e-07 max(|| b_i - A x_i ||_1) 8.161138e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.062945e+00 (SUCCESS) max(|| x_i ||_oo) 7.029080e-01 max(|| x0_i - x_i ||_oo) 5.068072e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 7.277820e-01 (SUCCESS) max(|| x_i ||_oo) 7.029080e-01 max(|| x0_i - x_i ||_oo) 5.068072e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 7.277820e-01 (SUCCESS) max(|| x_i ||_oo) 7.029080e-01 max(|| x0_i - x_i ||_oo) 5.068072e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 7.277820e-01 (SUCCESS) max(|| x_i ||_oo) 7.029080e-01 max(|| x0_i - x_i ||_oo) 5.068072e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 7.277820e-01 (SUCCESS) Time to solve 7.451945e+00 s WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by 
one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Time for refinement 1.983302e+01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.764572e-02 max(|| x_i ||_oo) 7.029081e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.840356e-07 max(|| b_i - A x_i ||_1) 8.217444e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.072668e+00 (SUCCESS) max(|| x_i ||_oo) 7.029081e-01 max(|| x0_i - x_i ||_oo) 5.067627e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 7.277182e-01 (SUCCESS) || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.764572e-02 max(|| x_i ||_oo) 7.029081e-01 || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.764572e-02 max(|| x_i ||_oo) 7.029081e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.840356e-07 max(|| b_i - A x_i ||_1) 8.217444e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.072668e+00 (SUCCESS) max(|| x_i ||_oo) 7.029081e-01 max(|| x0_i - x_i ||_oo) 5.067627e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 7.277182e-01 (SUCCESS) Time to initialize internal csc 4.281370e-04 s max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.840356e-07 max(|| b_i - A x_i ||_1) 8.217444e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.072668e+00 (SUCCESS) max(|| x_i ||_oo) 7.029081e-01 max(|| x0_i - x_i ||_oo) 5.067627e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 7.277182e-01 (SUCCESS) || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.764572e-02 max(|| x_i ||_oo) 7.029081e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.840356e-07 max(|| b_i - A x_i ||_1) 8.217444e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.072668e+00 (SUCCESS) max(|| x_i ||_oo) 7.029081e-01 max(|| x0_i - x_i ||_oo) 5.067627e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 7.277182e-01 (SUCCESS) Time to initialize coeftab 2.555368e-01 s Time to factorize 2.113780e+00 s (10.08 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Memory usage of coeftab 137 Ko Time to solve 1.590959e+00 s WARNING: Refinement works only with 1 rhs, We will iterate on each RHS 
one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Time for refinement 2.867688e+01 s || A ||_1 5.112398e-02 || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.764572e-02 max(|| x_i ||_oo) 7.029081e-01 max(|| b_i ||_oo) 2.764572e-02 max(|| x_i ||_oo) 7.029081e-01 || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.764572e-02 max(|| x_i ||_oo) 7.029081e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.852147e-07 max(|| b_i - A x_i ||_1) 8.198256e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.074835e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.852147e-07 max(|| b_i - A x_i ||_1) 8.198256e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.074835e+00 (SUCCESS) max(|| x_i ||_oo) 7.029081e-01 max(|| x0_i - x_i ||_oo) 5.067627e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 7.277182e-01 (SUCCESS) max(|| x_i ||_oo) 7.029081e-01 max(|| x0_i - x_i ||_oo) 5.067627e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 7.277182e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.852147e-07 max(|| b_i - A x_i ||_1) 8.198256e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.074835e+00 (SUCCESS) max(|| x_i ||_oo) 7.029081e-01 max(|| x0_i - x_i ||_oo) 5.067627e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 7.277182e-01 (SUCCESS) || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.764572e-02 max(|| x_i ||_oo) 7.029081e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.852147e-07 max(|| b_i - A x_i ||_1) 8.198256e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.074835e+00 (SUCCESS) max(|| x_i ||_oo) 7.029081e-01 max(|| x0_i - x_i ||_oo) 5.067627e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 7.277182e-01 (SUCCESS) Time to solve 4.256416e+00 s WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We 
will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Time for refinement 2.233291e+01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.764572e-02 max(|| x_i ||_oo) 7.029080e-01 || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.764572e-02 max(|| x_i ||_oo) 7.029080e-01 || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.764572e-02 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.856451e-07 max(|| b_i - A x_i ||_1) 8.158153e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.076766e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.856451e-07 max(|| b_i - A x_i ||_1) 8.158153e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.076766e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.856451e-07 max(|| b_i - A x_i ||_1) 8.158153e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.076766e+00 (SUCCESS) max(|| x_i ||_oo) 7.029080e-01 max(|| x0_i - x_i ||_oo) 5.067627e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 7.277182e-01 (SUCCESS) max(|| x_i ||_oo) 7.029080e-01 max(|| x0_i - x_i ||_oo) 5.067627e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 7.277182e-01 (SUCCESS) max(|| x_i ||_oo) 7.029080e-01 max(|| x0_i - x_i ||_oo) 5.067627e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 7.277182e-01 (SUCCESS) || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.764572e-02 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.856451e-07 max(|| b_i - A x_i ||_1) 8.158153e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.076766e+00 (SUCCESS) max(|| x_i ||_oo) 7.029080e-01 max(|| x0_i - x_i ||_oo) 5.067627e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 7.277182e-01 (SUCCESS) Start 1855: c_mpi_rep_example_step-by-step_lap_c_facto4 Test #1362: shm_example_simple_lap_s_facto2_sched4_kwayprojections_rqrrtend .........***Timeout 259.48 sec ischedInit: The thread number has been automatically 
set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.243922e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.345886e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.405593e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 1.207273e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.418860e+01 s +-------------------------------------------------+ 
Factorization task: Factorization used: LU Time to initialize internal csc 5.817239e-02 s Time to initialize coeftab 1.155219e-01 s Time to factorize 1.588442e+00 s ( 6.29 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 124 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 514 Ko / 514 Ko Time to solve 1.468207e+00 s Time for refinement 1.214037e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.948176e-07 max(|| b_i - A x_i ||_1) 8.326995e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.046341e+00 (SUCCESS) Start 1362: shm_example_simple_lap_s_facto2_sched4_kwayprojections_rqrrtend Test #1363: shm_example_simple_lap_s_facto2_sched4_kway_pqrcpilu0 ...................***Timeout 259.47 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 
Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.397084e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.504363e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.246390e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 2.321531e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.659209e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.332038e-02 s Time to initialize coeftab 2.835352e-01 s Time to factorize 3.497438e+00 s ( 2.85 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 124 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 514 Ko / 514 Ko Time to solve 1.227594e+00 s Time for refinement 1.478553e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.317529e-07 max(|| b_i - A x_i ||_1) 1.008469e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.267207e+00 (SUCCESS) Start 1363: 
shm_example_simple_lap_s_facto2_sched4_kway_pqrcpilu0 Test #1366: shm_example_simple_lap_d_facto0_sched4_not_svdend .......................***Timeout 259.34 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.398949e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.377987e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.430391e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time 
for mapping/scheduling 5.287246e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.494895e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.112781e-01 s Time to initialize coeftab 8.545294e-02 s Time to factorize 2.431409e+00 s ( 2.08 MFlop/s) Number of operations 6.47 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 2.545209e+00 s Time for refinement 4.089264e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.656457e-16 max(|| b_i - A x_i ||_1) 1.933618e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.429757e-03 (SUCCESS) Start 1366: shm_example_simple_lap_d_facto0_sched4_not_svdend Test #1367: shm_example_simple_lap_d_facto0_sched4_kway_svdbegin ....................***Timeout 259.13 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min 
ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.437647e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.927408e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.660550e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 7.043910e-02 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.453547e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.947787e-03 s Time to initialize coeftab 1.068553e-01 s Time to factorize 2.744446e+00 s ( 1.84 MFlop/s) Number of operations 6.47 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 1.215090e+00 s - iteration 1 : total iteration time 2.18 s error 1.4593e-14 Time for refinement 3.350268e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i 
||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.459407e-14 max(|| b_i - A x_i ||_1) 2.755601e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.462649e-02 (SUCCESS) Start 1367: shm_example_simple_lap_d_facto0_sched4_kway_svdbegin Test #1380: shm_example_simple_lap_d_facto0_sched4_kway_rqrcpend ....................***Timeout 271.57 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.509260e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.943464e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.204076e-02 s +-------------------------------------------------+ 
Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 1.164305e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.527454e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.260178e-03 s Time to initialize coeftab 3.637515e-02 s Time to factorize 1.371811e+00 s ( 3.69 MFlop/s) Number of operations 6.47 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 3.360731e+00 s Time for refinement 7.679989e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.685881e-16 max(|| b_i - A x_i ||_1) 1.961034e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.464208e-03 (SUCCESS) Start 1380: shm_example_simple_lap_d_facto0_sched4_kway_rqrcpend Test #1385: shm_example_simple_lap_d_facto0_sched4_kway_tqrcpbegin ..................***Timeout 271.92 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: 
AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.073062e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.037976e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.871357e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 4.410402e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.152129e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 7.299997e-02 s Time to initialize coeftab 4.288196e-01 s Time to factorize 4.938718e+00 s ( 1.03 MFlop/s) Number of operations 6.47 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ 
Total 638 Ko / 638 Ko Time to solve 1.066690e+00 s - iteration 1 : total iteration time 7.11 s error 5.8721e-14 Time for refinement 1.281999e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.871773e-14 max(|| b_i - A x_i ||_1) 9.152661e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.150110e-01 (SUCCESS) Start 1385: shm_example_simple_lap_d_facto0_sched4_kway_tqrcpbegin Test #1394: shm_example_simple_lap_d_facto0_sched4_kwayprojections_rqrrtend .........***Timeout 271.60 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.216560e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute 
symbol matrix 1.205515e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.434748e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 2.064344e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.263119e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.373616e-02 s Time to initialize coeftab 9.301410e-02 s Time to factorize 2.817744e+00 s ( 1.80 MFlop/s) Number of operations 6.47 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 7.524094e-01 s Time for refinement 3.578121e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.682970e-16 max(|| b_i - A x_i ||_1) 1.940794e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.438775e-03 (SUCCESS) Start 1394: shm_example_simple_lap_d_facto0_sched4_kwayprojections_rqrrtend Test #1398: shm_example_simple_lap_d_facto1_sched4_not_svdend .......................***Timeout 271.31 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled 
StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.492463e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.304877e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.187463e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.113451e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.991514e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.974657e-01 s Time to initialize coeftab 8.941442e-02 s Time to factorize 1.095017e+00 s ( 4.78 MFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Compression: 
------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 1.053682e+00 s Time for refinement 8.276909e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.635814e-16 max(|| b_i - A x_i ||_1) 1.874792e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.355837e-03 (SUCCESS) Start 1398: shm_example_simple_lap_d_facto1_sched4_not_svdend Test #1400: shm_example_simple_lap_d_facto1_sched4_kway_svdend ......................***Timeout 271.18 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.082874e+01 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.563983e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.235656e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 9.814225e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.290671e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.836490e-01 s Time to initialize coeftab 1.505387e-01 s Time to factorize 1.098872e+00 s ( 4.76 MFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 2.847580e+00 s Time for refinement 9.755039e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.535960e-16 max(|| b_i - A x_i ||_1) 1.853472e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.329047e-03 (SUCCESS) Start 1400: shm_example_simple_lap_d_facto1_sched4_kway_svdend Test #1043: shm_example_simple_lap_c_facto2_sched1_kway_pqrcpilu0 ...................***Timeout 270.69 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel 
Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.179875e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.280559e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.947944e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 3.050345e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.240312e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.260578e-02 s Time to 
initialize coeftab 1.189457e-01 s Time to factorize 1.829864e+00 s (21.84 MFlop/s) Number of operations 8.15 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 7.718677e-01 s Time for refinement 5.695144e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.029655e-07 max(|| b_i - A x_i ||_1) 1.113539e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.809793e+00 (SUCCESS) Test #1413: shm_example_simple_lap_d_facto1_sched4_kwayprojections_rqrcpbegin .......***Timeout 270.62 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 
+-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.405137e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.883287e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.420050e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.677530e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.814916e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.144315e-02 s Time to initialize coeftab 1.098903e+00 s Time to factorize 3.979081e+00 s ( 1.32 MFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 3.485636e+00 s - iteration 1 : total iteration time 1.43 s error 5.9542e-14 Time for refinement 1.983503e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.954036e-14 max(|| b_i - A x_i ||_1) 9.528920e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.197391e-01 (SUCCESS) Start 1413: shm_example_simple_lap_d_facto1_sched4_kwayprojections_rqrcpbegin Test #1414: 
shm_example_simple_lap_d_facto1_sched4_kwayprojections_rqrcpend .........***Timeout 271.22 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.378677e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.100403e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.884215e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 4.949718e-01 s 
+-------------------------------------------------+ Analyze task: Total time for analyze 6.988567e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.691404e-02 s Time to initialize coeftab 7.584084e-02 s Time to factorize 8.340157e-01 s ( 6.28 MFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 9.970086e-01 s Time for refinement 3.480465e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.544436e-16 max(|| b_i - A x_i ||_1) 1.840544e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.312801e-03 (SUCCESS) Start 1414: shm_example_simple_lap_d_facto1_sched4_kwayprojections_rqrcpend Test #1416: shm_example_simple_lap_d_facto1_sched4_not_tqrcpend .....................***Timeout 271.47 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 
Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.553483e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.205538e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.071316e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.347696e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.979685e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.222218e-02 s Time to initialize coeftab 3.259962e-02 s Time to factorize 1.063701e+00 s ( 4.92 MFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 1.049275e+00 s Time for refinement 3.759111e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.604198e-16 
max(|| b_i - A x_i ||_1) 1.866741e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.345721e-03 (SUCCESS) Start 1416: shm_example_simple_lap_d_facto1_sched4_not_tqrcpend Test #1417: shm_example_simple_lap_d_facto1_sched4_kway_tqrcpbegin ..................***Timeout 271.80 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.082231e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.647545e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.427187e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 
17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.017055e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.170674e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.307767e-02 s Time to initialize coeftab 4.223069e-01 s Time to factorize 3.528254e+00 s ( 1.48 MFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 2.864018e+00 s - iteration 1 : total iteration time 2.74 s error 3.0843e-14 Time for refinement 7.399314e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.084331e-14 max(|| b_i - A x_i ||_1) 5.347486e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.719575e-02 (SUCCESS) Start 1417: shm_example_simple_lap_d_facto1_sched4_kway_tqrcpbegin Test #1420: shm_example_simple_lap_d_facto1_sched4_kwayprojections_tqrcpend .........***Timeout 271.67 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 
6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.172805e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.428352e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.680647e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.613234e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.895254e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.011488e-02 s Time to initialize coeftab 2.789690e-02 s Time to factorize 1.002720e+00 s ( 5.22 MFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ 
Total 638 Ko / 638 Ko Time to solve 3.380356e+00 s Time for refinement 1.312499e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.570613e-16 max(|| b_i - A x_i ||_1) 1.849754e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.324375e-03 (SUCCESS) Start 1420: shm_example_simple_lap_d_facto1_sched4_kwayprojections_tqrcpend Test #1421: shm_example_simple_lap_d_facto1_sched4_not_rqrrtbegin ...................***Timeout 271.62 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.341469e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.753192e-02 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.959510e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.899251e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.738444e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.721691e-02 s Time to initialize coeftab 4.640623e-01 s Time to factorize 2.140716e+00 s ( 2.44 MFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 4.806172e+00 s - iteration 1 : total iteration time 8.91 s error 4.8316e-13 Time for refinement 1.278486e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.831633e-13 max(|| b_i - A x_i ||_1) 1.069520e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.343944e+00 (SUCCESS) Start 1421: shm_example_simple_lap_d_facto1_sched4_not_rqrrtbegin Test #1423: shm_example_simple_lap_d_facto1_sched4_kway_rqrrtbegin ..................***Timeout 271.71 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: 
Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.026926e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.528104e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.811732e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.003362e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.412235e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 7.846367e-02 s Time to initialize coeftab 2.490671e-01 s Time to factorize 3.100893e+00 s ( 1.69 MFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Compression: 
------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 3.662993e+00 s - iteration 1 : total iteration time 2.12 s error 3.3338e-12 - iteration 2 : total iteration time 2.6 s error 7.3592e-18 Time for refinement 1.147201e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.284254e-16 max(|| b_i - A x_i ||_1) 6.343604e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.971283e-04 (SUCCESS) Start 1423: shm_example_simple_lap_d_facto1_sched4_kway_rqrrtbegin Test #1425: shm_example_simple_lap_d_facto1_sched4_kwayprojections_rqrrtbegin .......***Timeout 271.64 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 
+-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.397223e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.651518e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.129220e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.129553e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.434668e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 6.501267e-03 s Time to initialize coeftab 3.454542e-01 s Time to factorize 3.666555e+00 s ( 1.43 MFlop/s) Number of operations 6.64 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 2.271076e+00 s - iteration 1 : total iteration time 1.8 s error 4.5258e-13 Time for refinement 3.707833e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.525790e-13 max(|| b_i - A x_i ||_1) 5.328894e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.696213e-01 (SUCCESS) Start 1425: shm_example_simple_lap_d_facto1_sched4_kwayprojections_rqrrtbegin Test #1429: 
shm_example_simple_lap_d_facto2_sched4_not_svdbegin .....................***Timeout 272.77 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.944155e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.174857e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.678693e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 1.458872e+00 s 
+-------------------------------------------------+ Analyze task: Total time for analyze 2.101502e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 8.306328e-03 s Time to initialize coeftab 3.219288e+00 s Time to factorize 3.318173e+00 s ( 3.01 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 3.963729e+00 s - iteration 1 : total iteration time 6.2 s error 1.4091e-14 Time for refinement 7.638560e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.408962e-14 max(|| b_i - A x_i ||_1) 2.163330e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.718410e-02 (SUCCESS) Start 1429: shm_example_simple_lap_d_facto2_sched4_not_svdbegin Test #1430: shm_example_simple_lap_d_facto2_sched4_not_svdend .......................***Timeout 272.77 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 
16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.781437e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.885683e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.550639e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 6.078905e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.529644e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.921832e-02 s Time to initialize coeftab 9.068757e-02 s Time to factorize 1.210013e+00 s ( 8.25 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 2.136672e+00 s Time for refinement 1.369461e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i 
||_2) 4.695000e-16 max(|| b_i - A x_i ||_1) 1.815085e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.280810e-03 (SUCCESS) Start 1430: shm_example_simple_lap_d_facto2_sched4_not_svdend Test #1431: shm_example_simple_lap_d_facto2_sched4_kway_svdbegin ....................***Timeout 272.74 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.284413e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.025562e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.136577e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 
130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 3.805800e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.866954e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.875307e-02 s Time to initialize coeftab 7.686808e-01 s Time to factorize 5.664133e+00 s ( 1.76 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 7.412300e-01 s - iteration 1 : total iteration time 7.21 s error 1.031e-14 Time for refinement 1.156638e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.031116e-14 max(|| b_i - A x_i ||_1) 1.869175e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.348779e-02 (SUCCESS) Start 1431: shm_example_simple_lap_d_facto2_sched4_kway_svdbegin Test #1432: shm_example_simple_lap_d_facto2_sched4_kway_svdend ......................***Timeout 272.69 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD 
Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.017520e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.440233e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.605833e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 2.094359e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.999161e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.426531e-02 s Time to initialize coeftab 4.596710e-02 s Time to factorize 1.106921e+00 s ( 9.02 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 
Mo Time to solve 1.459242e+00 s Time for refinement 2.178704e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.594738e-16 max(|| b_i - A x_i ||_1) 1.795890e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.256690e-03 (SUCCESS) Start 1432: shm_example_simple_lap_d_facto2_sched4_kway_svdend Test #1433: shm_example_simple_lap_d_facto2_sched4_kwayprojections_svdbegin .........***Timeout 272.73 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.521629e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.602107e-02 s +-------------------------------------------------+ 
Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.848254e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 3.330823e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.121783e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.430965e-02 s Time to initialize coeftab 2.503343e-01 s Time to factorize 3.754595e+00 s ( 2.66 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 2.850172e+00 s - iteration 1 : total iteration time 7.98 s error 1.2094e-14 Time for refinement 1.603650e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.209328e-14 max(|| b_i - A x_i ||_1) 2.225021e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.795930e-02 (SUCCESS) Start 1433: shm_example_simple_lap_d_facto2_sched4_kwayprojections_svdbegin Test #1434: shm_example_simple_lap_d_facto2_sched4_kwayprojections_svdend ...........***Timeout 272.74 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI 
processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.064572e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.238804e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.680966e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 1.149273e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.095582e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.606841e-03 s Time to initialize coeftab 4.797990e-02 s Time to factorize 2.211523e+00 s ( 4.51 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank 
supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 3.381679e+00 s Time for refinement 9.010603e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.670952e-16 max(|| b_i - A x_i ||_1) 1.823377e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.291230e-03 (SUCCESS) Start 1434: shm_example_simple_lap_d_facto2_sched4_kwayprojections_svdend Test #1436: shm_example_simple_lap_d_facto2_sched4_not_pqrcpend .....................***Timeout 272.66 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.198677e+01 s +-------------------------------------------------+ Symbolic 
factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.252589e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.129287e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 2.878023e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.251600e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.352940e-02 s Time to initialize coeftab 3.141236e+00 s Time to factorize 2.428889e+00 s ( 4.11 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 1.537416e+00 s Time for refinement 1.851837e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.675857e-16 max(|| b_i - A x_i ||_1) 1.815701e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.281584e-03 (SUCCESS) Start 1436: shm_example_simple_lap_d_facto2_sched4_not_pqrcpend Test #1438: shm_example_simple_lap_d_facto2_sched4_kway_pqrcpend ....................***Timeout 272.98 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.369344e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.268175e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.193813e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 1.269754e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.396662e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 4.473766e-03 s Time to initialize coeftab 1.112943e+00 s 
Time to factorize 2.605711e+00 s ( 3.83 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 1.751747e+00 s Time for refinement 8.768291e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.600734e-16 max(|| b_i - A x_i ||_1) 1.799640e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.261402e-03 (SUCCESS) Start 1438: shm_example_simple_lap_d_facto2_sched4_kway_pqrcpend Test #1441: shm_example_simple_lap_d_facto2_sched4_not_rqrcpbegin ...................***Timeout 273.03 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 
+-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.660122e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.216705e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.300522e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 1.398160e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.061391e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.006134e-02 s Time to initialize coeftab 1.031727e+00 s Time to factorize 3.196686e+00 s ( 3.12 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 9.161195e-01 s - iteration 1 : total iteration time 1.52 s error 1.8572e-14 Time for refinement 3.916487e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.857487e-14 max(|| b_i - A x_i ||_1) 3.401523e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.274306e-02 (SUCCESS) Start 1441: shm_example_simple_lap_d_facto2_sched4_not_rqrcpbegin Test #1442: 
shm_example_simple_lap_d_facto2_sched4_not_rqrcpend .....................***Timeout 273.09 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.425621e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.841020e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.740062e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 1.890234e-01 s 
+-------------------------------------------------+ Analyze task: Total time for analyze 1.466235e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 7.369901e-03 s Time to initialize coeftab 1.182382e-01 s Time to factorize 2.001480e+00 s ( 4.99 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 3.324663e+00 s Time for refinement 6.676725e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.735314e-16 max(|| b_i - A x_i ||_1) 1.824358e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.292463e-03 (SUCCESS) Start 1442: shm_example_simple_lap_d_facto2_sched4_not_rqrcpend Test #1444: shm_example_simple_lap_d_facto2_sched4_kway_rqrcpend ....................***Timeout 273.15 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per 
block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.264885e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.180857e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.287354e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 9.168399e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.391422e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.318393e-01 s Time to initialize coeftab 2.323348e-01 s Time to factorize 1.582226e+00 s ( 6.31 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 2.282402e+00 s Time for refinement 8.135178e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.700843e-16 max(|| b_i - A x_i ||_1) 1.806242e-16 
max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.269698e-03 (SUCCESS) Start 1444: shm_example_simple_lap_d_facto2_sched4_kway_rqrcpend Test #1447: shm_example_simple_lap_d_facto2_sched4_not_tqrcpbegin ...................***Timeout 273.24 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.886413e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.929348e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.656952e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in 
full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 6.018851e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.631178e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 9.313928e-03 s Time to initialize coeftab 3.112264e-01 s Time to factorize 5.669616e+00 s ( 1.76 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 763 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 7.871352e-01 s - iteration 1 : total iteration time 6.68 s error 2.9382e-14 Time for refinement 1.152656e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.937832e-14 max(|| b_i - A x_i ||_1) 4.855090e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.100838e-02 (SUCCESS) Start 1447: shm_example_simple_lap_d_facto2_sched4_not_tqrcpbegin Test #1451: shm_example_simple_lap_d_facto2_sched4_kwayprojections_tqrcpbegin .......***Timeout 272.41 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - 
CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.146970e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.661449e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.356205e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 2.202528e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.191594e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.656192e-02 s Time to initialize coeftab 7.146127e-01 s Time to factorize 5.595194e+00 s ( 1.78 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 4.313898e+00 s 
- iteration 1 : total iteration time 2.16 s error 3.269e-14 Time for refinement 4.425727e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.268436e-14 max(|| b_i - A x_i ||_1) 5.453703e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.853046e-02 (SUCCESS) Start 1451: shm_example_simple_lap_d_facto2_sched4_kwayprojections_tqrcpbegin Test #1456: shm_example_simple_lap_d_facto2_sched4_kway_rqrrtend ....................***Timeout 272.72 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.139558e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.274821e-02 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.166568e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 4.317298e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.206611e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.355211e-03 s Time to initialize coeftab 5.809078e-02 s Time to factorize 3.057538e+00 s ( 3.27 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 3.366736e+00 s Time for refinement 3.340681e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.666159e-16 max(|| b_i - A x_i ||_1) 1.784035e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.241794e-03 (SUCCESS) Start 1456: shm_example_simple_lap_d_facto2_sched4_kway_rqrrtend Test #1458: shm_example_simple_lap_d_facto2_sched4_kwayprojections_rqrrtend .........***Timeout 272.67 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number 
of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.546883e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.226767e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.115500e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 3.328153e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.603478e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 9.470615e-03 s Time to initialize coeftab 2.618186e-02 s Time to factorize 2.876360e+00 s ( 3.47 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 
7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 4.060226e+00 s Time for refinement 6.917224e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.552885e-16 max(|| b_i - A x_i ||_1) 1.776823e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.232731e-03 (SUCCESS) Start 1458: shm_example_simple_lap_d_facto2_sched4_kwayprojections_rqrrtend Test #1459: shm_example_simple_lap_d_facto2_sched4_kway_pqrcpilu0 ...................***Timeout 272.66 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.418773e+01 s +-------------------------------------------------+ Symbolic factorization subtask: 
Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.536388e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.660000e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 2.383529e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.484711e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.041510e-02 s Time to initialize coeftab 1.440292e-01 s Time to factorize 1.210585e+00 s ( 8.25 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 2.971165e+00 s - iteration 1 : total iteration time 1.36 s error 5.6758e-15 Time for refinement 3.245106e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.680008e-15 max(|| b_i - A x_i ||_1) 5.415929e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.805581e-03 (SUCCESS) Start 1459: shm_example_simple_lap_d_facto2_sched4_kway_pqrcpilu0 Test #1460: shm_example_simple_lap_d_facto2_sched4_kway_pqrcpilu1 ...................***Timeout 272.75 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.128009e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.872681e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.956882e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 2.177067e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.184827e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.167139e-02 s Time to initialize coeftab 7.458687e-02 
s Time to factorize 1.172583e+00 s ( 8.52 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 1.941367e+00 s - iteration 1 : total iteration time 6.82 s error 2.3229e-15 Time for refinement 1.171963e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.329902e-15 max(|| b_i - A x_i ||_1) 1.343821e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.688627e-03 (SUCCESS) Start 1460: shm_example_simple_lap_d_facto2_sched4_kway_pqrcpilu1 Test #1464: shm_example_simple_lap_c_facto0_sched4_kway_svdend ......................***Timeout 273.70 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 
Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.659397e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.957766e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.439709e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 3.912004e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.412261e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.072863e-02 s Time to initialize coeftab 1.077497e+00 s Time to factorize 3.182661e+00 s ( 6.37 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 7.343816e+00 s Time for refinement 9.601951e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.053799e-07 max(|| b_i - A x_i ||_1) 9.166727e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.313039e+00 (SUCCESS) Start 1464: shm_example_simple_lap_c_facto0_sched4_kway_svdend Test #1468: shm_example_simple_lap_c_facto0_sched4_not_pqrcpend 
.....................***Timeout 274.32 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.385515e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.945546e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.685129e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 2.373337e+00 s +-------------------------------------------------+ Analyze task: Total 
time for analyze 2.666104e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.311525e-02 s Time to initialize coeftab 2.884191e+00 s Time to factorize 2.791337e+00 s ( 7.27 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 2.011539e+00 s Time for refinement 5.196657e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.037248e-07 max(|| b_i - A x_i ||_1) 9.064798e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.287319e+00 (SUCCESS) Start 1468: shm_example_simple_lap_c_facto0_sched4_not_pqrcpend Test #1469: shm_example_simple_lap_c_facto0_sched4_kway_pqrcpbegin ..................***Timeout 274.40 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy 
KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.045541e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.525811e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.271361e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 2.650202e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.078688e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.568101e-03 s Time to initialize coeftab 4.196529e-01 s Time to factorize 1.961334e+00 s (10.34 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 1.101604e+00 s - iteration 1 : total iteration time 1.61 s error 2.0212e-11 Time for refinement 2.908146e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822262e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.495188e-08 max(|| b_i - A x_i ||_1) 
3.208059e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.094893e-01 (SUCCESS) Start 1469: shm_example_simple_lap_c_facto0_sched4_kway_pqrcpbegin Test #1474: shm_example_simple_lap_c_facto0_sched4_not_rqrcpend .....................***Timeout 277.08 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.327856e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.645886e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.355794e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of 
operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 1.968790e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.847491e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.704213e-02 s Time to initialize coeftab 2.931704e-01 s Time to factorize 2.307774e+00 s ( 8.79 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 3.813023e+00 s Time for refinement 6.165010e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.007634e-07 max(|| b_i - A x_i ||_1) 8.991464e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.268815e+00 (SUCCESS) Start 1474: shm_example_simple_lap_c_facto0_sched4_not_rqrcpend Test #1477: shm_example_simple_lap_c_facto0_sched4_kwayprojections_rqrcpbegin .......***Timeout 277.56 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy 
Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.763104e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.645830e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.714176e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 2.344603e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.247693e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.597868e-02 s Time to initialize coeftab 5.900829e-01 s Time to factorize 3.805815e+00 s ( 5.33 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 4.205145e+00 s - iteration 1 : total 
iteration time 8.27 s error 4.9598e-11 Time for refinement 1.252769e+01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822262e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.209792e-08 max(|| b_i - A x_i ||_1) 3.176310e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.014779e-01 (SUCCESS) Start 1477: shm_example_simple_lap_c_facto0_sched4_kwayprojections_rqrcpbegin Test #1480: shm_example_simple_lap_c_facto0_sched4_not_tqrcpend .....................***Timeout 277.77 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 1480: shm_example_simple_lap_c_facto0_sched4_not_tqrcpend Test #1483: shm_example_simple_lap_c_facto0_sched4_kwayprojections_tqrcpbegin .......***Timeout 277.88 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: 
sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.014681e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.470755e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.006871e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 2.020286e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.376300e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.749813e-02 s Time to initialize coeftab 1.081531e+00 s Time to factorize 1.046874e+01 s ( 1.94 MFlop/s) 
Number of operations 25.68 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 1.770691e+00 s Time for refinement 5.163734e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.417812e-07 max(|| b_i - A x_i ||_1) 1.227154e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.096477e+00 (SUCCESS) Start 1483: shm_example_simple_lap_c_facto0_sched4_kwayprojections_tqrcpbegin Test #1484: shm_example_simple_lap_c_facto0_sched4_kwayprojections_tqrcpend .........***Timeout 277.86 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 
+-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.057160e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.935446e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.744802e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 3.695301e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.109961e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.526730e-03 s Time to initialize coeftab 1.319525e-01 s Time to factorize 2.747843e+00 s ( 7.38 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 5.891978e+00 s Time for refinement 3.774688e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.012923e-07 max(|| b_i - A x_i ||_1) 9.052288e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.284163e+00 (SUCCESS) Start 1484: shm_example_simple_lap_c_facto0_sched4_kwayprojections_tqrcpend Test #1487: shm_example_simple_lap_c_facto0_sched4_kway_rqrrtbegin 
..................***Timeout 277.75 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.959821e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.279511e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.415442e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 5.024731e-01 s +-------------------------------------------------+ Analyze task: Total time for 
analyze 4.636756e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.873621e-02 s Time to initialize coeftab 8.405111e-01 s Time to factorize 5.024722e+00 s ( 4.04 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 1.493720e+00 s Time for refinement 2.407868e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.840430e-07 max(|| b_i - A x_i ||_1) 1.305601e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.294420e+00 (SUCCESS) Start 1487: shm_example_simple_lap_c_facto0_sched4_kway_rqrrtbegin Test #1489: shm_example_simple_lap_c_facto0_sched4_kwayprojections_rqrrtbegin .......***Timeout 277.70 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and 
projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.825308e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.805269e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.763412e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 1.755581e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.131924e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.442825e-01 s Time to initialize coeftab 1.293635e+00 s Time to factorize 4.848962e+00 s ( 4.18 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 2.957340e+00 s Time for refinement 8.729305e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.079145e-07 max(|| b_i - A x_i ||_1) 1.802516e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * 
||x_i||_oo * eps)) 4.548285e+00 (SUCCESS) Start 1489: shm_example_simple_lap_c_facto0_sched4_kwayprojections_rqrrtbegin Test #1049: shm_example_simple_lap_c_facto3_sched1_kwayprojections_svdbegin .........***Timeout 277.67 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.705444e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.144572e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.085595e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: 
LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 5.984994e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.783968e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 3.593092e-03 s Time to initialize coeftab 7.444396e-01 s Time to factorize 1.283318e+01 s ( 1.58 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 3.620622e-01 s Time for refinement 4.129287e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.138284e-07 max(|| b_i - A x_i ||_1) 9.831092e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.480678e+00 (SUCCESS) Test #1492: shm_example_simple_lap_c_facto0_sched4_kway_pqrcpilu1 ...................***Timeout 277.65 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress 
minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.986788e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.275552e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.666269e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 8.545191e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.124523e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.276103e-01 s Time to initialize coeftab 5.877201e-01 s Time to factorize 2.513789e+00 s ( 8.07 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 3.995748e+00 s Time for refinement 1.077311e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i 
- A x_i ||_2 / || b_i ||_2) 2.019648e-07 max(|| b_i - A x_i ||_1) 9.089501e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.293553e+00 (SUCCESS) Start 1492: shm_example_simple_lap_c_facto0_sched4_kway_pqrcpilu1 Test #1501: shm_example_simple_lap_c_facto1_sched4_kway_pqrcpbegin ..................***Timeout 277.75 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.356459e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.811217e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.391458e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number 
of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.633078e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.105962e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.077806e-01 s Time to initialize coeftab 1.232049e+00 s Time to factorize 3.449435e+00 s ( 6.18 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 3.282627e+00 s - iteration 1 : total iteration time 1.26 s error 4.9393e-11 Time for refinement 2.415407e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822262e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.387344e-08 max(|| b_i - A x_i ||_1) 3.198366e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.070433e-01 (SUCCESS) Start 1501: shm_example_simple_lap_c_facto1_sched4_kway_pqrcpbegin Test #1502: shm_example_simple_lap_c_facto1_sched4_kway_pqrcpend ....................***Timeout 277.74 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 
/ 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.334406e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.707385e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.261276e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.548605e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.397106e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 6.344011e-03 s Time to initialize coeftab 1.117352e-01 s Time to factorize 1.871326e+00 s (11.39 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko 
------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 4.326360e+00 s Time for refinement 9.698796e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.064947e-07 max(|| b_i - A x_i ||_1) 8.854058e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.234144e+00 (SUCCESS) Start 1502: shm_example_simple_lap_c_facto1_sched4_kway_pqrcpend Test #1503: shm_example_simple_lap_c_facto1_sched4_kwayprojections_pqrcpbegin .......***Timeout 277.67 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.079809e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol 
matrix 5.027229e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.501347e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.396317e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.134900e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.235609e-02 s Time to initialize coeftab 3.031370e-01 s Time to factorize 3.571537e+00 s ( 5.97 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 6.021745e+00 s Time for refinement 4.965204e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.207172e-07 max(|| b_i - A x_i ||_1) 1.534207e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.871262e+00 (SUCCESS) Start 1503: shm_example_simple_lap_c_facto1_sched4_kwayprojections_pqrcpbegin Test #1506: shm_example_simple_lap_c_facto1_sched4_not_rqrcpend .....................***Timeout 277.48 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled 
StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.093519e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.088470e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.762184e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.131515e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.127569e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.153593e-01 s Time to initialize coeftab 1.461863e-01 s Time to factorize 2.209538e+00 s ( 9.64 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: 
------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 9.321747e-01 s Time for refinement 1.589743e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.049359e-07 max(|| b_i - A x_i ||_1) 8.745643e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.206788e+00 (SUCCESS) Start 1506: shm_example_simple_lap_c_facto1_sched4_not_rqrcpend Test #1510: shm_example_simple_lap_c_facto1_sched4_kwayprojections_rqrcpend .........***Timeout 277.34 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.830002e+01 
s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.888281e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.913292e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.293199e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.139727e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.884299e-01 s Time to initialize coeftab 1.939223e-01 s Time to factorize 1.688334e+00 s (12.62 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 2.868396e+00 s Time for refinement 5.672037e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.093341e-07 max(|| b_i - A x_i ||_1) 8.866999e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.237409e+00 (SUCCESS) Start 1510: shm_example_simple_lap_c_facto1_sched4_kwayprojections_rqrcpend Test #1512: shm_example_simple_lap_c_facto1_sched4_not_tqrcpend .....................***Timeout 277.37 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + 
PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.222571e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.835536e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.065231e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 9.348063e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.479238e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 
2.479143e-01 s Time to initialize coeftab 2.338001e-01 s Time to factorize 3.368330e+00 s ( 6.33 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 7.496855e+00 s Time for refinement 2.600175e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.044188e-07 max(|| b_i - A x_i ||_1) 8.782733e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.216146e+00 (SUCCESS) Start 1512: shm_example_simple_lap_c_facto1_sched4_not_tqrcpend Test #1516: shm_example_simple_lap_c_facto1_sched4_kwayprojections_tqrcpend .........***Timeout 277.47 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian 
Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.682684e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.891802e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.634568e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.922351e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.712138e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 9.359300e-03 s Time to initialize coeftab 4.277065e-02 s Time to factorize 9.477327e-01 s (22.48 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 1.530439e+00 s Time for refinement 1.074795e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.057611e-07 max(|| b_i - A x_i ||_1) 8.822025e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.226061e+00 (SUCCESS) Start 1516: shm_example_simple_lap_c_facto1_sched4_kwayprojections_tqrcpend Test #1520: 
shm_example_simple_lap_c_facto1_sched4_kway_rqrrtend ....................***Timeout 277.61 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.516877e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.779921e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.829214e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.018967e+00 s 
+-------------------------------------------------+ Analyze task: Total time for analyze 1.109495e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.985558e-01 s Time to initialize coeftab 2.197454e+00 s Time to factorize 2.510436e+00 s ( 8.49 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 1.099124e+00 s Time for refinement 8.718939e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.086221e-07 max(|| b_i - A x_i ||_1) 8.881193e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.240991e+00 (SUCCESS) Start 1520: shm_example_simple_lap_c_facto1_sched4_kway_rqrrtend Test #1524: shm_example_simple_lap_c_facto1_sched4_kway_pqrcpilu1 ...................***Timeout 279.29 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance 
criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.586915e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.717091e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.994322e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.019568e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.956796e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.990660e-02 s Time to initialize coeftab 4.114612e-01 s Time to factorize 2.975861e+00 s ( 7.16 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 3.195970e+00 s Time for refinement 1.141214e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.062550e-07 max(|| b_i - 
A x_i ||_1) 8.931326e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.253641e+00 (SUCCESS) Start 1524: shm_example_simple_lap_c_facto1_sched4_kway_pqrcpilu1 1620/3626 Test #1862: c_mpi_rep_example_personal_lap_s_facto1 .................................***Timeout 295.78 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Personal Ordering method is: Personal (myorder->permtab/peritab) Time to compute ordering 1.728934e-01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 499500 Fill-in of L 135.000000 Time to compute symbol matrix 2.141665e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.635155e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 499500 Fill-in 135.000000 Number of operations in full-rank: LDL^t 319.92 MFlops Prediction: 
Model AMD 6180 MKL Time to factorize 5.559740e-02 s Time for mapping/scheduling 8.660246e-01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.963367e-03 s Time to initialize coeftab 1.344417e-01 s Time to factorize 1.892822e+01 s (16.90 MFlop/s) Number of operations 1.33 GFlops Number of static pivots 0 Memory usage of coeftab 2.38 Mo Time to solve 3.806658e+00 s Time for refinement 1.079640e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996109e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.706768e-07 max(|| b_i - A x_i ||_1) 1.325168e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.665159e+00 (SUCCESS) max(|| x_i ||_oo) 4.996109e-01 max(|| x0_i - x_i ||_oo) 3.874302e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 7.754641e-01 (SUCCESS) || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996109e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.706768e-07 max(|| b_i - A x_i ||_1) 1.325168e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.665159e+00 (SUCCESS) max(|| x_i ||_oo) 4.996109e-01 max(|| x0_i - x_i ||_oo) 3.874302e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 7.754641e-01 (SUCCESS) || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996109e-01 || A ||_1 5.112398e-02 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.706768e-07 max(|| b_i - A x_i ||_1) 1.325168e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.665159e+00 (SUCCESS) max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996109e-01 max(|| x_i ||_oo) 4.996109e-01 max(|| x0_i - x_i ||_oo) 3.874302e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 7.754641e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.706768e-07 max(|| b_i - A x_i ||_1) 1.325168e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.665159e+00 (SUCCESS) max(|| x_i ||_oo) 4.996109e-01 max(|| x0_i - x_i ||_oo) 3.874302e-07 max(|| x0_i - x_i ||_oo / || 
x0_i ||_oo) 7.754641e-01 (SUCCESS) Start 1862: c_mpi_rep_example_personal_lap_s_facto1 Test #1041: shm_example_simple_lap_c_facto2_sched1_kwayprojections_rqrrtbegin .......***Timeout 295.76 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.205616e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.420951e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.989127e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model 
AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 2.553706e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.246591e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 5.114659e-02 s Time to initialize coeftab 3.836413e-01 s Time to factorize 1.237998e+01 s ( 3.23 MFlop/s) Number of operations 8.15 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 6.441417e-01 s - iteration 1 : total iteration time 0.634 s error 4.9375e-11 Time for refinement 1.717405e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822262e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.589900e-08 max(|| b_i - A x_i ||_1) 3.249002e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.198205e-01 (SUCCESS) Test #1045: shm_example_simple_lap_c_facto3_sched1_not_svdbegin .....................***Timeout 295.73 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal 
width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.137051e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.457812e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.985827e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 2.443146e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.199920e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 2.477774e-02 s Time to initialize coeftab 6.723011e-01 s Time to factorize 4.845881e+00 s ( 4.19 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 4.855407e-01 s Time for refinement 5.164246e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 
6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.174310e-07 max(|| b_i - A x_i ||_1) 9.792557e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.470955e+00 (SUCCESS) Test #1450: shm_example_simple_lap_d_facto2_sched4_kway_tqrcpend ....................***Timeout 295.74 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.171530e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.125250e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.651574e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 
35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 6.704123e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.515347e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.130582e-02 s Time to initialize coeftab 2.245571e+00 s Time to factorize 3.202188e+00 s ( 3.12 MFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 9.203251e-01 s Time for refinement 9.635770e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.629026e-16 max(|| b_i - A x_i ||_1) 1.788233e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.247069e-03 (SUCCESS) Start 1450: shm_example_simple_lap_d_facto2_sched4_kway_tqrcpend Test #1527: shm_example_simple_lap_c_facto2_sched4_kway_svdbegin ....................***Timeout 295.86 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: 
Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.261496e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.049690e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.262837e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 2.436588e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.517803e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.078423e-02 s Time to initialize coeftab 1.579355e+00 s Time to factorize 9.174717e+00 s ( 4.36 MFlop/s) Number of operations 8.15 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 3.212967e+00 s Time for refinement 6.296780e-01 s || A 
||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.026105e-07 max(|| b_i - A x_i ||_1) 8.997822e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.270420e+00 (SUCCESS) Start 1527: shm_example_simple_lap_c_facto2_sched4_kway_svdbegin 1622/3626 Test #1867: c_mpi_rep_example_personal_lap_c_facto0 .................................***Timeout 295.98 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Personal Ordering method is: Personal (myorder->permtab/peritab) Time to compute ordering 4.660614e-02 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 499500 Fill-in of L 135.000000 Time to compute symbol matrix 1.743809e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.124016e-01 s +-------------------------------------------------+ 
Mapping/Scheduling subtask: Number of non-zeroes in blocked L 499500 Fill-in 135.000000 Number of operations in full-rank: LL^h 1.25 GFlops Prediction: Model AMD 6180 MKL Time to factorize 5.430933e-02 s Time for mapping/scheduling 1.006658e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.096760e-03 s Time to initialize coeftab 2.352674e+00 s Time to factorize 5.082410e+01 s (25.09 MFlop/s) Number of operations 5.33 GFlops Number of static pivots 0 Memory usage of coeftab 4.77 Mo Time to solve 5.530392e+00 s Time for refinement 4.031080e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822265e-01 || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822265e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.828903e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.828903e-07 max(|| b_i - A x_i ||_1) 1.405896e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.547497e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.405896e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.547497e+00 (SUCCESS) max(|| x_i ||_oo) 6.822265e-01 max(|| x0_i - x_i ||_oo) 4.532011e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 6.642973e-01 (SUCCESS) max(|| x_i ||_oo) 6.822265e-01 max(|| x0_i - x_i ||_oo) 4.532011e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 6.642973e-01 (SUCCESS) || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822265e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.828903e-07 max(|| b_i - A x_i ||_1) 1.405896e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.547497e+00 (SUCCESS) max(|| x_i ||_oo) 6.822265e-01 max(|| x0_i - x_i ||_oo) 4.532011e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 6.642973e-01 (SUCCESS) || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822265e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.828903e-07 max(|| b_i - A x_i ||_1) 1.405896e-07 max(|| b_i - A x_i ||_1 / 
(||A||_1 * ||x_i||_oo * eps)) 3.547497e+00 (SUCCESS) max(|| x_i ||_oo) 6.822265e-01 max(|| x0_i - x_i ||_oo) 4.532011e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 6.642973e-01 (SUCCESS) Start 1867: c_mpi_rep_example_personal_lap_c_facto0 1622/3626 Test #1871: c_mpi_rep_example_personal_lap_c_facto4 .................................***Timeout 297.04 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Personal Ordering method is: Personal (myorder->permtab/peritab) Time to compute ordering 7.090279e-02 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 499500 Fill-in of L 135.000000 Time to compute symbol matrix 2.025740e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.376410e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 499500 Fill-in 
135.000000 Number of operations in full-rank: LDL^h 1.25 GFlops Prediction: Model AMD 6180 MKL Time to factorize 5.559740e-02 s Time for mapping/scheduling 1.357879e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 2.091244e-02 s Time to initialize coeftab 6.508640e-01 s Time to factorize 2.293070e+01 s (55.92 MFlop/s) Number of operations 5.33 GFlops Number of static pivots 0 Memory usage of coeftab 4.77 Mo Time to solve 2.199362e+00 s Time for refinement 1.535807e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.655161e-07 max(|| b_i - A x_i ||_1) 1.315255e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.318781e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.655161e-07 max(|| x_i ||_oo) 6.822264e-01 max(|| x0_i - x_i ||_oo) 4.470348e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 6.552589e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 1.315255e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.318781e+00 (SUCCESS) max(|| x_i ||_oo) 6.822264e-01 max(|| x0_i - x_i ||_oo) 4.470348e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 6.552589e-01 (SUCCESS) || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.655161e-07 max(|| b_i - A x_i ||_1) 1.315255e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.318781e+00 (SUCCESS) max(|| x_i ||_oo) 6.822264e-01 max(|| x0_i - x_i ||_oo) 4.470348e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 6.552589e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.655161e-07 max(|| b_i - A x_i ||_1) 1.315255e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.318781e+00 (SUCCESS) max(|| x_i ||_oo) 
6.822264e-01 max(|| x0_i - x_i ||_oo) 4.470348e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 6.552589e-01 (SUCCESS) Start 1871: c_mpi_rep_example_personal_lap_c_facto4 1622/3626 Test #1877: c_mpi_rep_example_simple_scotch_rsa .....................................***Timeout 300.21 sec RSA driver is no longer supported and is replaced by the HB driver RSA driver is no longer supported and is replaced by the HB driver RSA driver is no longer supported and is replaced by the HB driver RSA driver is no longer supported and is replaced by the HB driver ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 12111 nnz: 40537 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.320357e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 1607873 Fill-in of L 39.664331 Time to compute symbol matrix 1.661276e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.241007e-01 
s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 1607873 Fill-in 39.664331 Number of operations in full-rank: LDL^t 644.52 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.714741e-02 s Time for mapping/scheduling 3.756206e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.811151e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 7.732714e-02 s Time to initialize coeftab 1.717453e-01 s Time to factorize 2.544670e+01 s (25.33 MFlop/s) Number of operations 3.22 GFlops Number of static pivots 5000 Memory usage of coeftab 1.82 Mo Time to solve 4.409361e+00 s Time for refinement 7.290064e+00 s || A ||_1 2.170513e-01 max(|| b_i ||_oo) 9.894407e-02 max(|| x_i ||_oo) 5.347105e-01 || A ||_1 2.170513e-01 max(|| b_i ||_oo) 9.894407e-02 max(|| x_i ||_oo) 5.347105e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.213426e-14 max(|| b_i - A x_i ||_1) 2.083110e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.340988e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.213426e-14 max(|| b_i - A x_i ||_1) 2.083110e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.340988e-01 (SUCCESS) || A ||_1 2.170513e-01 max(|| b_i ||_oo) 9.894407e-02 max(|| x_i ||_oo) 5.347105e-01 || A ||_1 2.170513e-01 max(|| b_i ||_oo) 9.894407e-02 max(|| x_i ||_oo) 5.347105e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.213426e-14 max(|| b_i - A x_i ||_1) 2.083110e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.340988e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.213426e-14 max(|| b_i - A x_i ||_1) 2.083110e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.340988e-01 (SUCCESS) Start 1877: c_mpi_rep_example_simple_scotch_rsa 1622/3626 Test #1884: c_mpi_rep_example_simple_single_mm2 .....................................***Timeout 300.65 sec ischedInit: The 
thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: IJV N: 1280 nnz: 12029 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.145605e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 10749 Fill-in of L 0.893590 Time to compute symbol matrix 1.558507e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.469566e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 21498 Fill-in 1.787181 Number of operations in full-rank: LU 1.08 MFlops Prediction: Model AMD 6180 MKL Time to factorize 8.513996e-04 s Time for mapping/scheduling 1.804266e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.404528e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.108048e-01 s Time to initialize coeftab 
1.453613e-01 s Time to factorize 8.166899e-01 s ( 1.33 MFlop/s) Number of operations 4.80 MFlops Number of static pivots 0 Memory usage of coeftab 427 Ko Time to solve 1.251025e+00 s Time for refinement 4.175564e+00 s || A ||_1 7.256473e-01 max(|| b_i ||_oo) 3.156448e-01 max(|| x_i ||_oo) 6.963723e-01 || A ||_1 7.256473e-01 max(|| b_i ||_oo) 3.156448e-01 max(|| x_i ||_oo) 6.963723e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.464863e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.464863e-16 max(|| b_i - A x_i ||_1) 8.234284e-19 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.786812e-04 (SUCCESS) max(|| b_i - A x_i ||_1) 8.234284e-19 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.786812e-04 (SUCCESS) || A ||_1 7.256473e-01 max(|| b_i ||_oo) 3.156448e-01 max(|| x_i ||_oo) 6.963723e-01 || A ||_1 7.256473e-01 max(|| b_i ||_oo) 3.156448e-01 max(|| x_i ||_oo) 6.963723e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.464863e-16 max(|| b_i - A x_i ||_1) 8.234284e-19 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.786812e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.464863e-16 max(|| b_i - A x_i ||_1) 8.234284e-19 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.786812e-04 (SUCCESS) Start 1884: c_mpi_rep_example_simple_single_mm2 1622/3626 Test #1886: c_mpi_rep_example_step-by-step_single_mm ................................***Timeout 300.95 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number 
of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Complex64 Format: IJV N: 841 nnz: 2465 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.330547e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 24350 Fill-in of L 9.878296 Time to compute symbol matrix 1.431593e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.866299e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 24350 Fill-in 9.878296 Number of operations in full-rank: LDL^t 3.58 MFlops Prediction: Model AMD 6180 MKL Time to factorize 2.058387e-04 s Time for mapping/scheduling 1.617748e-01 s Time to initialize internal csc 4.787849e-03 s Time to initialize coeftab 7.380489e-02 s Time to factorize 1.176482e+00 s ( 3.05 MFlop/s) Number of operations 17.02 MFlops Number of static pivots 0 Memory usage of coeftab 134 Ko Time to solve 1.715109e+00 s WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Time for refinement 2.138407e+01 s || A ||_1 8.589990e-02 max(|| b_i ||_oo) 4.291881e-02 max(|| x_i ||_oo) 7.029080e-01 || A ||_1 8.589990e-02 max(|| b_i ||_oo) 4.291881e-02 max(|| x_i ||_oo) 7.029080e-01 || A ||_1 8.589990e-02 || A ||_1 8.589990e-02 max(|| b_i ||_oo) 
4.291881e-02 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i ||_oo) 4.291881e-02 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.396117e-15 max(|| b_i - A x_i ||_1) 1.611415e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.932722e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.396117e-15 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.396117e-15 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.396117e-15 max(|| b_i - A x_i ||_1) 1.611415e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.932722e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 1.611415e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.932722e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 1.611415e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.932722e-03 (SUCCESS) Time to solve 1.557981e+00 s WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Time for refinement 2.292154e+01 s || A ||_1 8.589990e-02 max(|| b_i ||_oo) 4.291881e-02 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.067560e-15 max(|| b_i - A x_i ||_1) 1.600031e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.749595e-03 (SUCCESS) Time to initialize internal csc 9.591280e-04 s || A ||_1 8.589990e-02 max(|| b_i ||_oo) 4.291881e-02 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.067560e-15 max(|| b_i - A x_i ||_1) 1.600031e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.749595e-03 (SUCCESS) || A ||_1 8.589990e-02 max(|| b_i ||_oo) 4.291881e-02 max(|| x_i ||_oo) 7.029080e-01 || A ||_1 8.589990e-02 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.067560e-15 max(|| b_i - A x_i ||_1) 1.600031e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.749595e-03 
(SUCCESS) max(|| b_i ||_oo) 4.291881e-02 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.067560e-15 max(|| b_i - A x_i ||_1) 1.600031e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.749595e-03 (SUCCESS) Time to initialize coeftab 1.158782e-01 s Time to factorize 1.829843e+00 s ( 1.96 MFlop/s) Number of operations 17.02 MFlops Number of static pivots 0 Memory usage of coeftab 134 Ko Time to solve 5.220907e+00 s WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Time for refinement 2.870513e+01 s || A ||_1 8.589990e-02 max(|| b_i ||_oo) 4.291881e-02 max(|| x_i ||_oo) 7.029080e-01 || A ||_1 8.589990e-02 max(|| b_i ||_oo) 4.291881e-02 max(|| x_i ||_oo) 7.029080e-01 || A ||_1 8.589990e-02 max(|| b_i ||_oo) 4.291881e-02 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.746331e-15 max(|| b_i - A x_i ||_1) 1.550929e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.483896e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.746331e-15 max(|| b_i - A x_i ||_1) 1.550929e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.483896e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.746331e-15 max(|| b_i - A x_i ||_1) 1.550929e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.483896e-03 (SUCCESS) || A ||_1 8.589990e-02 max(|| b_i ||_oo) 4.291881e-02 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.746331e-15 max(|| b_i - A x_i ||_1) 1.550929e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.483896e-03 (SUCCESS) Time to solve 1.716630e+00 s Time for refinement 1.346913e+01 s || A ||_1 8.589990e-02 max(|| b_i ||_oo) 4.291881e-02 max(|| x_i ||_oo) 7.029080e-01 || A ||_1 
8.589990e-02 max(|| b_i ||_oo) 4.291881e-02 max(|| x_i ||_oo) 7.029080e-01 || A ||_1 8.589990e-02 max(|| b_i ||_oo) 4.291881e-02 max(|| x_i ||_oo) 7.029080e-01 || A ||_1 8.589990e-02 max(|| b_i ||_oo) 4.291881e-02 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.473492e-15 max(|| b_i - A x_i ||_1) 1.522360e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.382877e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.473492e-15 max(|| b_i - A x_i ||_1) 1.522360e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.382877e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.473492e-15 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.473492e-15 max(|| b_i - A x_i ||_1) 1.522360e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.382877e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 1.522360e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.382877e-03 (SUCCESS) WARNING: WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Refinement works only with 1 rhs, We will iterate on each RHS one by one Start 1886: c_mpi_rep_example_step-by-step_single_mm 1622/3626 Test #1888: c_mpi_rep_example_step-by-step_single_mm2 ...............................***Timeout 301.24 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size 
(min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: IJV N: 1280 nnz: 12029 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.241330e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 10749 Fill-in of L 0.893590 Time to compute symbol matrix 1.777075e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.276485e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 21498 Fill-in 1.787181 Number of operations in full-rank: LU 1.08 MFlops Prediction: Model AMD 6180 MKL Time to factorize 8.513996e-04 s Time for mapping/scheduling 4.084523e-01 s Time to initialize internal csc 6.626319e-02 s Time to initialize coeftab 6.119862e-02 s Time to factorize 4.002781e-01 s ( 2.71 MFlop/s) Number of operations 4.80 MFlops Number of static pivots 0 Memory usage of coeftab 427 Ko Time to solve 6.020300e-01 s WARNING: WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Time for refinement 9.071228e+00 s || A ||_1 7.256473e-01 max(|| b_i ||_oo) 3.287903e-01 max(|| x_i ||_oo) 7.029080e-01 || A ||_1 7.256473e-01 max(|| b_i ||_oo) 3.287903e-01 max(|| x_i ||_oo) 7.029080e-01 || A ||_1 7.256473e-01 max(|| b_i ||_oo) 
3.287903e-01 max(|| x_i ||_oo) 7.029080e-01 || A ||_1 7.256473e-01 max(|| b_i ||_oo) 3.287903e-01 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.081183e-16 max(|| b_i - A x_i ||_1) 1.159050e-18 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.415762e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.081183e-16 max(|| b_i - A x_i ||_1) 1.159050e-18 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.415762e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.081183e-16 max(|| b_i - A x_i ||_1) 1.159050e-18 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.415762e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.081183e-16 max(|| b_i - A x_i ||_1) 1.159050e-18 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.415762e-04 (SUCCESS) Time to solve 6.305035e-01 s WARNING: WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Refinement works only with 1 rhs, We will iterate on each RHS one by one Time for refinement 2.231960e+01 s || A ||_1 7.256473e-01 max(|| b_i ||_oo) 3.287903e-01 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.081183e-16 max(|| b_i - A x_i ||_1) 1.159050e-18 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.415762e-04 (SUCCESS) || A ||_1 7.256473e-01 max(|| b_i ||_oo) 3.287903e-01 max(|| x_i ||_oo) 7.029080e-01 || A ||_1 7.256473e-01 max(|| b_i ||_oo) 3.287903e-01 max(|| x_i ||_oo) 7.029080e-01 || A ||_1 7.256473e-01 max(|| b_i ||_oo) 3.287903e-01 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.081183e-16 max(|| b_i - A x_i ||_1) 1.159050e-18 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.415762e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.081183e-16 max(|| b_i - A x_i ||_1) 1.159050e-18 max(|| b_i - A x_i ||_1 / (||A||_1 
* ||x_i||_oo * eps)) 5.415762e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.081183e-16 max(|| b_i - A x_i ||_1) 1.159050e-18 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.415762e-04 (SUCCESS) Time to initialize internal csc 6.567285e-01 s Time to initialize coeftab 1.253777e-01 s Time to factorize 1.728854e-01 s ( 6.27 MFlop/s) Number of operations 4.80 MFlops Number of static pivots 0 Memory usage of coeftab 427 Ko Time to solve 1.287145e+00 s WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Time for refinement 1.971474e+01 s || A ||_1 7.256473e-01 max(|| b_i ||_oo) 3.287903e-01 max(|| x_i ||_oo) 7.029080e-01 || A ||_1 7.256473e-01 || A ||_1 7.256473e-01 max(|| b_i ||_oo) 3.287903e-01 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i ||_oo) 3.287903e-01 max(|| x_i ||_oo) 7.029080e-01 || A ||_1 7.256473e-01 max(|| b_i ||_oo) 3.287903e-01 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.081183e-16 max(|| b_i - A x_i ||_1) 1.159050e-18 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.415762e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.081183e-16 max(|| b_i - A x_i ||_1) 1.159050e-18 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.415762e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.081183e-16 max(|| b_i - A x_i ||_1) 1.159050e-18 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.415762e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.081183e-16 max(|| b_i - A x_i ||_1) 1.159050e-18 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.415762e-04 (SUCCESS) Time to solve 6.138277e+00 s WARNING: WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Refinement works only with 1 
rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Time for refinement 2.876169e+01 s || A ||_1 7.256473e-01 max(|| b_i ||_oo) 3.287903e-01 max(|| x_i ||_oo) 7.029080e-01 || A ||_1 7.256473e-01 max(|| b_i ||_oo) 3.287903e-01 max(|| x_i ||_oo) 7.029080e-01 || A ||_1 7.256473e-01 max(|| b_i ||_oo) 3.287903e-01 max(|| x_i ||_oo) 7.029080e-01 || A ||_1 7.256473e-01 max(|| b_i ||_oo) 3.287903e-01 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.081183e-16 max(|| b_i - A x_i ||_1) 1.159050e-18 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.415762e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.081183e-16 max(|| b_i - A x_i ||_1) 1.159050e-18 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.415762e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.081183e-16 max(|| b_i - A x_i ||_1) 1.159050e-18 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.415762e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.081183e-16 max(|| b_i - A x_i ||_1) 1.159050e-18 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.415762e-04 (SUCCESS) Start 1888: c_mpi_rep_example_step-by-step_single_mm2 1622/3626 Test #1889: c_mpi_rep_example_simple_refine_cg ......................................***Timeout 301.25 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication 
support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 12111 nnz: 40537 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.302741e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 1607873 Fill-in of L 39.664331 Time to compute symbol matrix 2.300239e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.522224e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 1607873 Fill-in 39.664331 Number of operations in full-rank: LDL^t 644.52 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.714741e-02 s Time for mapping/scheduling 3.722263e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.890304e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.065017e-02 s Time to initialize coeftab 8.027892e-02 s Time to factorize 2.852553e+01 s (22.59 MFlop/s) Number of operations 3.22 GFlops Number of static pivots 5000 Memory usage of coeftab 1.82 Mo Time to solve 8.594473e+00 s Time for refinement 6.864368e+00 s || A ||_1 2.170513e-01 || A ||_1 2.170513e-01 max(|| b_i ||_oo) 9.894407e-02 max(|| x_i ||_oo) 5.347105e-01 max(|| b_i ||_oo) 9.894407e-02 max(|| x_i ||_oo) 5.347105e-01 || A ||_1 2.170513e-01 max(|| b_i ||_oo) 9.894407e-02 max(|| x_i ||_oo) 5.347105e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.213426e-14 max(|| 
b_i - A x_i ||_2 / || b_i ||_2) 9.213426e-14 max(|| b_i - A x_i ||_1) 2.083110e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.340988e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.083110e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.340988e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.213426e-14 max(|| b_i - A x_i ||_1) 2.083110e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.340988e-01 (SUCCESS) || A ||_1 2.170513e-01 max(|| b_i ||_oo) 9.894407e-02 max(|| x_i ||_oo) 5.347105e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.213426e-14 max(|| b_i - A x_i ||_1) 2.083110e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.340988e-01 (SUCCESS) Start 1889: c_mpi_rep_example_simple_refine_cg 1622/3626 Test #1890: c_mpi_rep_example_simple_refine_gmres ...................................***Timeout 301.25 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: General Arithmetic: Double Format: CSC N: 1030 nnz: 6858 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.376107e+01 s +-------------------------------------------------+ Symbolic 
factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 51109 Fill-in of L 7.452464 Time to compute symbol matrix 1.386919e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.991622e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 102218 Fill-in 14.904929 Number of operations in full-rank: LU 5.50 MFlops Prediction: Model AMD 6180 MKL Time to factorize 7.121319e-04 s Time for mapping/scheduling 5.699947e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.530334e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.498358e-03 s Time to initialize coeftab 1.731622e-01 s Time to factorize 7.150914e+00 s (787.45 KFlop/s) Number of operations 9.09 MFlops Number of static pivots 0 Memory usage of coeftab 299 Ko Time to solve 3.642304e+00 s Time for refinement 1.068755e+00 s || A ||_1 3.076897e-01 max(|| b_i ||_oo) 8.377794e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 3.076897e-01 || A ||_1 3.076897e-01 max(|| b_i ||_oo) 8.377794e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 3.076897e-01 max(|| b_i ||_oo) 8.377794e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 8.377794e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.605481e-16 max(|| b_i - A x_i ||_1) 1.457588e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.141424e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.605481e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.605481e-16 max(|| b_i - A x_i ||_1) 1.457588e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.141424e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 1.457588e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.141424e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.605481e-16 
max(|| b_i - A x_i ||_1) 1.457588e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.141424e-03 (SUCCESS) Start 1890: c_mpi_rep_example_simple_refine_gmres 1622/3626 Test #1893: c_mpi_rep_example_refinement_lap_s_refine_gmres_sym .....................***Timeout 301.57 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.957988e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.040277e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.439427e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for 
mapping/scheduling 1.328211e+00 s Time to initialize internal csc 2.570692e-02 s - iteration 1 : total iteration time 3.99 s error 0.20451 - iteration 2 : total iteration time 2 s error 0.05944 - iteration 3 : total iteration time 2.53 s error 0.019007 - iteration 4 : total iteration time 6.93 s error 0.0066596 - iteration 5 : total iteration time 3.35 s error 0.0023054 - iteration 6 : total iteration time 7.2 s error 0.00077935 - iteration 7 : total iteration time 11.2 s error 0.00027759 - iteration 8 : total iteration time 10.3 s error 9.3505e-05 - iteration 9 : total iteration time 14.2 s error 3.0631e-05 - iteration 10 : total iteration time 5.25 s error 1.0018e-05 - iteration 11 : total iteration time 9.34 s error 3.0985e-06 - iteration 12 : total iteration time 6.25 s error 9.3879e-07 Time for refinement 8.361858e+01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996102e-01 || A ||_1 5.112398e-02 || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996102e-01 || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996102e-01 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996102e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.509874e-07 max(|| b_i - A x_i ||_1) 5.197932e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.531538e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.509874e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.509874e-07 max(|| b_i - A x_i ||_1) 5.197932e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.531538e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 5.197932e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.531538e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.509874e-07 max(|| b_i - A x_i ||_1) 5.197932e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.531538e+00 (SUCCESS) Start 1893: c_mpi_rep_example_refinement_lap_s_refine_gmres_sym 1622/3626 Test #1895: c_mpi_rep_example_refinement_lap_d_refine_cg_sym 
........................***Timeout 301.87 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.090282e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.272661e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.440493e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 8.530563e-01 s Time to initialize internal csc 2.438049e-02 s - iteration 1 : total iteration time 1.85 s error 0.20926 - iteration 2 : total iteration time 2.94 s error 0.062122 - iteration 3 : total iteration time 6.18 s 
error 0.02006 - iteration 4 : total iteration time 2.29 s error 0.0071103 - iteration 5 : total iteration time 1.86 s error 0.0024573 - iteration 6 : total iteration time 1.74 s error 0.00082811 - iteration 7 : total iteration time 4.89 s error 0.00029707 - iteration 8 : total iteration time 4.59 s error 9.9308e-05 - iteration 9 : total iteration time 8.06 s error 3.242e-05 - iteration 10 : total iteration time 8.04 s error 1.06e-05 - iteration 11 : total iteration time 3 s error 3.2564e-06 - iteration 12 : total iteration time 2.26 s error 9.7881e-07 - iteration 13 : total iteration time 8.49 s error 2.9112e-07 - iteration 14 : total iteration time 1.66 s error 8.5895e-08 - iteration 15 : total iteration time 2.87 s error 2.5018e-08 - iteration 16 : total iteration time 1.56 s error 7.0466e-09 - iteration 17 : total iteration time 1.78 s error 1.9646e-09 - iteration 18 : total iteration time 1.51 s error 5.6405e-10 - iteration 19 : total iteration time 5.62 s error 1.8841e-10 - iteration 20 : total iteration time 1.27 s error 6.9878e-11 - iteration 21 : total iteration time 3.48 s error 2.6229e-11 - iteration 22 : total iteration time 2.35 s error 7.7221e-12 - iteration 23 : total iteration time 2.07 s error 2.1952e-12 - iteration 24 : total iteration time 1.02 s error 6.6583e-13 Time for refinement 8.439605e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.658304e-13 max(|| b_i - A x_i ||_1) 3.912524e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.916422e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.658304e-13 max(|| b_i - A x_i ||_1) 3.912524e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 
4.916422e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.658304e-13 max(|| b_i - A x_i ||_1) 3.912524e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.916422e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.658304e-13 max(|| b_i - A x_i ||_1) 3.912524e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.916422e+00 (SUCCESS) Start 1895: c_mpi_rep_example_refinement_lap_d_refine_cg_sym 1622/3626 Test #1897: c_mpi_rep_example_refinement_lap_d_refine_bicgstab_sym ..................***Timeout 301.84 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.747031e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.962614e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.741623e-02 s 
+-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 3.349614e-01 s Time to initialize internal csc 2.111204e-03 s - iteration 1 : total iteration time 14.6 s error 0.07074 - iteration 2 : total iteration time 4.32 s error 0.011538 - iteration 3 : total iteration time 4.25 s error 0.0023186 - iteration 4 : total iteration time 5.23 s error 0.0004819 - iteration 5 : total iteration time 3.49 s error 9.5045e-05 - iteration 6 : total iteration time 6.43 s error 1.892e-05 - iteration 7 : total iteration time 3.3 s error 3.6759e-06 - iteration 8 : total iteration time 6.34 s error 6.2371e-07 - iteration 9 : total iteration time 11.6 s error 9.8696e-08 - iteration 10 : total iteration time 11.3 s error 1.5113e-08 - iteration 11 : total iteration time 13.1 s error 2.1077e-09 - iteration 12 : total iteration time 3.29 s error 2.8188e-10 - iteration 13 : total iteration time 3.09 s error 3.9203e-11 - iteration 14 : total iteration time 8 s error 5.9627e-12 - iteration 15 : total iteration time 2.54 s error 1.1146e-12 - iteration 16 : total iteration time 5.39 s error 2.4047e-13 Time for refinement 1.071090e+02 s || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.404599e-13 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.404599e-13 max(|| b_i - A x_i ||_1) 1.037788e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.304070e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.037788e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * 
eps)) 1.304070e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.404599e-13 max(|| b_i - A x_i ||_1) 1.037788e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.304070e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.404599e-13 max(|| b_i - A x_i ||_1) 1.037788e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.304070e+00 (SUCCESS) Start 1897: c_mpi_rep_example_refinement_lap_d_refine_bicgstab_sym 1622/3626 Test #1898: c_mpi_rep_example_refinement_lap_c_refine_cg_her ........................***Timeout 301.87 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Complex32 Format: CSC N: 1000 nnz: 11476 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.093639e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 84938 Fill-in of L 7.401359 Time to compute symbol matrix 1.692523e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.615329e-02 s 
+-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 169876 Fill-in 14.802719 Number of operations in full-rank: LU 62.91 MFlops Prediction: Model AMD 6180 MKL Time to factorize 2.138097e-03 s Time for mapping/scheduling 1.049008e+00 s Time to initialize internal csc 7.200174e-03 s - iteration 1 : total iteration time 4.12 s error 0.22938 - iteration 2 : total iteration time 1.68 s error 0.075299 - iteration 3 : total iteration time 2.19 s error 0.030345 - iteration 4 : total iteration time 5.51 s error 0.0096614 - iteration 5 : total iteration time 1.61 s error 0.0046779 - iteration 6 : total iteration time 1.4 s error 0.0022192 - iteration 7 : total iteration time 2.53 s error 0.00074571 - iteration 8 : total iteration time 1.59 s error 0.00022577 - iteration 9 : total iteration time 4.63 s error 0.00010136 - iteration 10 : total iteration time 2.95 s error 3.8357e-05 - iteration 11 : total iteration time 5.8 s error 1.7899e-05 - iteration 12 : total iteration time 5.54 s error 6.9732e-06 - iteration 13 : total iteration time 6.74 s error 2.0633e-06 - iteration 14 : total iteration time 2.95 s error 7.9663e-07 Time for refinement 5.044216e+01 s || A ||_1 5.530729e-02 max(|| b_i ||_oo) 2.785291e-02 max(|| x_i ||_oo) 6.822262e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.208071e-07 max(|| b_i - A x_i ||_1) 3.927347e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.072077e+01 (SUCCESS) || A ||_1 5.530729e-02 max(|| b_i ||_oo) 2.785291e-02 max(|| x_i ||_oo) 6.822262e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.208071e-07 max(|| b_i - A x_i ||_1) 3.927347e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.072077e+01 (SUCCESS) || A ||_1 5.530729e-02 || A ||_1 5.530729e-02 max(|| b_i ||_oo) 2.785291e-02 max(|| x_i ||_oo) 6.822262e-01 max(|| b_i ||_oo) 2.785291e-02 max(|| x_i ||_oo) 6.822262e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.208071e-07 max(|| b_i - A x_i ||_1) 3.927347e-07 
max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.072077e+01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.208071e-07 max(|| b_i - A x_i ||_1) 3.927347e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.072077e+01 (SUCCESS) Start 1898: c_mpi_rep_example_refinement_lap_c_refine_cg_her 1622/3626 Test #1899: c_mpi_rep_example_refinement_lap_c_refine_gmres_her .....................***Timeout 301.90 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Complex32 Format: CSC N: 1000 nnz: 11476 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.029275e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 84938 Fill-in of L 7.401359 Time to compute symbol matrix 3.509575e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.063080e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 169876 
Fill-in 14.802719 Number of operations in full-rank: LU 62.91 MFlops Prediction: Model AMD 6180 MKL Time to factorize 2.138097e-03 s Time for mapping/scheduling 9.741664e-01 s Time to initialize internal csc 1.025135e-01 s - iteration 1 : total iteration time 0.874 s error 0.22315 - iteration 2 : total iteration time 1.44 s error 0.071346 - iteration 3 : total iteration time 2.23 s error 0.027924 - iteration 4 : total iteration time 5.08 s error 0.0091303 - iteration 5 : total iteration time 2.47 s error 0.0041632 - iteration 6 : total iteration time 2.9 s error 0.0019584 - iteration 7 : total iteration time 5.09 s error 0.0006969 - iteration 8 : total iteration time 10.6 s error 0.00021478 - iteration 9 : total iteration time 12.1 s error 9.167e-05 - iteration 10 : total iteration time 13.7 s error 3.5488e-05 - iteration 11 : total iteration time 4.83 s error 1.7481e-05 - iteration 12 : total iteration time 4.2 s error 1.1734e-05 - iteration 13 : total iteration time 8.02 s error 4.7999e-06 - iteration 14 : total iteration time 5.39 s error 2.1237e-06 - iteration 15 : total iteration time 2.79 s error 1.0848e-06 - iteration 16 : total iteration time 3.09 s error 5.1146e-07 Time for refinement 8.879827e+01 s || A ||_1 5.530729e-02 max(|| b_i ||_oo) 2.785291e-02 max(|| x_i ||_oo) 6.822262e-01 || A ||_1 5.530729e-02 max(|| b_i ||_oo) 2.785291e-02 max(|| x_i ||_oo) 6.822262e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.665992e-07 max(|| b_i - A x_i ||_1) 2.549875e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.960579e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.665992e-07 max(|| b_i - A x_i ||_1) 2.549875e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.960579e+00 (SUCCESS) || A ||_1 5.530729e-02 max(|| b_i ||_oo) 2.785291e-02 max(|| x_i ||_oo) 6.822262e-01 || A ||_1 5.530729e-02 max(|| b_i ||_oo) 2.785291e-02 max(|| x_i ||_oo) 6.822262e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.665992e-07 max(|| b_i - A x_i ||_1) 2.549875e-07 
max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.960579e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.665992e-07 max(|| b_i - A x_i ||_1) 2.549875e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.960579e+00 (SUCCESS) Start 1899: c_mpi_rep_example_refinement_lap_c_refine_gmres_her 1622/3626 Test #1902: c_mpi_rep_example_refinement_lap_c_refine_gmres_sym .....................***Timeout 302.54 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.769931e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.645485e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.386407e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 
Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 5.090997e-01 s Time to initialize internal csc 1.200699e-02 s - iteration 1 : total iteration time 0.678 s error 0.20013 - iteration 2 : total iteration time 3.79 s error 0.056488 - iteration 3 : total iteration time 1.54 s error 0.017842 - iteration 4 : total iteration time 1.39 s error 0.0060829 - iteration 5 : total iteration time 2.03 s error 0.0021257 - iteration 6 : total iteration time 5.77 s error 0.00075052 - iteration 7 : total iteration time 2.44 s error 0.00026229 - iteration 8 : total iteration time 2.36 s error 8.7579e-05 - iteration 9 : total iteration time 2.4 s error 2.9068e-05 - iteration 10 : total iteration time 9.21 s error 9.6391e-06 - iteration 11 : total iteration time 16.3 s error 3.0472e-06 - iteration 12 : total iteration time 14.3 s error 1.4447e-06 - iteration 13 : total iteration time 4.42 s error 6.3499e-07 Time for refinement 6.849881e+01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822265e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.557033e-07 max(|| b_i - A x_i ||_1) 3.732027e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.417021e+00 (SUCCESS) || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822265e-01 || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822265e-01 || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822265e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.557033e-07 max(|| b_i - A x_i ||_1) 3.732027e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.417021e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.557033e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.557033e-07 max(|| b_i - A x_i ||_1) 3.732027e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.417021e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 3.732027e-07 max(|| b_i - A 
x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.417021e+00 (SUCCESS) Start 1902: c_mpi_rep_example_refinement_lap_c_refine_gmres_sym 1622/3626 Test #1903: c_mpi_rep_example_refinement_lap_c_refine_bicgstab_sym ..................***Timeout 302.60 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.275099e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.223444e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.975007e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 5.874109e-01 s Time to 
initialize internal csc 2.124214e-02 s - iteration 1 : total iteration time 4.45 s error 0.067713 - iteration 2 : total iteration time 8.24 s error 0.010548 - iteration 3 : total iteration time 16.8 s error 0.002058 - iteration 4 : total iteration time 7.62 s error 0.00043473 - iteration 5 : total iteration time 9.03 s error 9.1545e-05 - iteration 6 : total iteration time 4.9 s error 1.8273e-05 - iteration 7 : total iteration time 8.6 s error 3.4435e-06 - iteration 8 : total iteration time 5.18 s error 5.9495e-07 Time for refinement 6.628048e+01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112398e-02 || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.059824e-07 max(|| b_i - A x_i ||_1) 2.586746e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.527135e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.059824e-07 max(|| b_i - A x_i ||_1) 2.586746e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.527135e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.059824e-07 max(|| b_i - A x_i ||_1) 2.586746e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.527135e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.059824e-07 max(|| b_i - A x_i ||_1) 2.586746e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.527135e+00 (SUCCESS) Start 1903: c_mpi_rep_example_refinement_lap_c_refine_bicgstab_sym 1622/3626 Test #1907: c_mpi_rep_example_refinement_lap_z_refine_cg_sym ........................***Timeout 302.87 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: 
sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.797605e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.924287e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.192033e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.804371e+00 s Time to initialize internal csc 5.612847e-02 s - iteration 1 : total iteration time 1.38 s error 0.20457 - iteration 2 : total iteration time 2.01 s error 0.058883 - iteration 3 : total iteration time 0.907 s error 0.018804 - iteration 4 : total iteration time 6.01 s error 0.0064705 - iteration 5 : total iteration time 1.33 s error 0.0022688 - iteration 6 : total iteration time 1.76 s error 0.00080218 - iteration 7 : total iteration time 1.84 s error 0.00027994 - iteration 8 : total 
iteration time 1.51 s error 9.2911e-05 - iteration 9 : total iteration time 5.03 s error 3.0814e-05 - iteration 10 : total iteration time 4.03 s error 1.0212e-05 - iteration 11 : total iteration time 7.97 s error 3.1309e-06 - iteration 12 : total iteration time 8.56 s error 9.4295e-07 - iteration 13 : total iteration time 3.1 s error 2.8244e-07 - iteration 14 : total iteration time 7.15 s error 8.3271e-08 - iteration 15 : total iteration time 4.11 s error 2.4241e-08 - iteration 16 : total iteration time 1.89 s error 7.1239e-09 - iteration 17 : total iteration time 2.59 s error 1.9923e-09 - iteration 18 : total iteration time 2.18 s error 5.4819e-10 - iteration 19 : total iteration time 1.46 s error 1.674e-10 - iteration 20 : total iteration time 1.82 s error 6.409e-11 - iteration 21 : total iteration time 3.97 s error 2.4264e-11 - iteration 22 : total iteration time 1.81 s error 7.0628e-12 - iteration 23 : total iteration time 5.28 s error 1.9608e-12 - iteration 24 : total iteration time 0.844 s error 5.9077e-13 Time for refinement 8.160076e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.907550e-13 max(|| b_i - A x_i ||_1) 3.424757e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.641824e+00 (SUCCESS) || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.907550e-13 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.907550e-13 max(|| b_i - A x_i ||_1) 3.424757e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.641824e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 3.424757e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.641824e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.907550e-13 max(|| b_i - A x_i ||_1) 3.424757e-13 
max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.641824e+00 (SUCCESS) Start 1907: c_mpi_rep_example_refinement_lap_z_refine_cg_sym 1622/3626 Test #1912: c_mpi_rep_example_simple_mixed_refine_bicgstab ..........................***Timeout 302.48 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: General Arithmetic: Double Format: CSC N: 1030 nnz: 6858 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.097602e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 51109 Fill-in of L 7.452464 Time to compute symbol matrix 1.288637e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.242661e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 102218 Fill-in 14.904929 Number of operations in full-rank: LU 5.50 MFlops Prediction: Model AMD 6180 MKL Time to factorize 7.121319e-04 s Time for mapping/scheduling 1.003139e+00 s 
+-------------------------------------------------+ Analyze task: Total time for analyze 2.297228e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.225665e-02 s Time to initialize coeftab 8.799538e-02 s Time to factorize 1.460819e+00 s ( 3.76 MFlop/s) Number of operations 9.09 MFlops Number of static pivots 0 Memory usage of coeftab 150 Ko Time to solve 3.184052e+00 s - iteration 1 : total iteration time 9.83 s error 9.5813e-16 Time for refinement 1.112482e+01 s || A ||_1 3.076897e-01 || A ||_1 3.076897e-01 max(|| b_i ||_oo) 8.377794e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 3.076897e-01 max(|| b_i ||_oo) 8.377794e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.524705e-16 max(|| b_i ||_oo) 8.377794e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.524705e-16 max(|| b_i - A x_i ||_1) 3.802455e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.977668e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.524705e-16 max(|| b_i - A x_i ||_1) 3.802455e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.977668e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 3.802455e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.977668e-03 (SUCCESS) || A ||_1 3.076897e-01 max(|| b_i ||_oo) 8.377794e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.524705e-16 max(|| b_i - A x_i ||_1) 3.802455e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.977668e-03 (SUCCESS) Start 1912: c_mpi_rep_example_simple_mixed_refine_bicgstab 1622/3626 Test #1919: c_mpi_rep_example_simple_mixed_lap_z_refine_cg_sym ......................***Timeout 302.84 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.928099e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.921550e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.002794e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 5.287736e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.066947e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.431609e-02 s Time to initialize coeftab 3.865697e-01 s Time to factorize 3.780985e+00 s (10.57 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Memory usage of coeftab 274 Ko Time to solve 3.076735e+00 s - iteration 1 : 
total iteration time 12.9 s error 4.5779e-14 Time for refinement 2.261428e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.577561e-14 max(|| b_i - A x_i ||_1) 1.357898e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.426437e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.577561e-14 max(|| b_i - A x_i ||_1) 1.357898e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.426437e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.577561e-14 max(|| b_i - A x_i ||_1) 1.357898e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.426437e-01 (SUCCESS) || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.577561e-14 max(|| b_i - A x_i ||_1) 1.357898e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.426437e-01 (SUCCESS) Start 1919: c_mpi_rep_example_simple_mixed_lap_z_refine_cg_sym 1622/3626 Test #1924: mpi_rep_example_simple_lap_s_facto2_sched0_1d ...........................***Timeout 303.14 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No 
compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.304828e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.092780e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.696955e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.206554e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.329479e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.319442e-03 s Time to initialize coeftab 3.391499e-01 s Time to factorize 1.576607e-01 s (63.33 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Memory usage of coeftab 137 Ko Time to solve 1.781861e-02 s Time for refinement 3.612455e-02 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.702563e-07 max(|| b_i - A x_i ||_1) 7.621450e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.576846e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.702563e-07 max(|| b_i - A x_i ||_1) 7.621450e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 
9.576846e-01 (SUCCESS) || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.702563e-07 max(|| b_i - A x_i ||_1) 7.621450e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.576846e-01 (SUCCESS) || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.702563e-07 max(|| b_i - A x_i ||_1) 7.621450e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.576846e-01 (SUCCESS) Start 1924: mpi_rep_example_simple_lap_s_facto2_sched0_1d 1622/3626 Test #1930: mpi_rep_example_simple_lap_c_facto2_sched0_1d ...........................***Timeout 304.69 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.172402e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 
1.499132e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.914898e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 9.685365e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.516501e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.763286e-02 s Time to initialize coeftab 3.471169e-01 s Time to factorize 1.171135e+00 s (34.13 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Memory usage of coeftab 274 Ko Time to solve 1.564220e-01 s Time for refinement 1.401477e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.746326e-07 max(|| b_i - A x_i ||_1) 7.762885e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.958808e+00 (SUCCESS) || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.746326e-07 max(|| b_i - A x_i ||_1) 7.762885e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.958808e+00 (SUCCESS) || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.746326e-07 max(|| b_i - A x_i ||_1) 7.762885e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.958808e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.746326e-07 max(|| b_i - A x_i ||_1) 7.762885e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.958808e+00 (SUCCESS) Start 1930: 
mpi_rep_example_simple_lap_c_facto2_sched0_1d Test #1541: shm_example_simple_lap_c_facto2_sched4_kwayprojections_rqrcpbegin .......***Timeout 301.29 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.783616e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.424651e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.112665e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 
1.465112e-03 s Time for mapping/scheduling 4.234033e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.068303e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 7.602336e-02 s Time to initialize coeftab 7.417601e-01 s Time to factorize 5.687247e+00 s ( 7.03 MFlop/s) Number of operations 8.15 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 1.074219e+00 s - iteration 1 : total iteration time 4.13 s error 4.9098e-11 Time for refinement 5.282607e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822262e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.254219e-08 max(|| b_i - A x_i ||_1) 3.119699e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.871934e-01 (SUCCESS) Start 1541: shm_example_simple_lap_c_facto2_sched4_kwayprojections_rqrcpbegin 1622/3626 Test #1935: mpi_rep_example_simple_lap_z_facto2_sched0_1d ...........................***Timeout 300.75 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression 
ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.404181e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.285687e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.820592e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 3.092940e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.737544e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 9.206324e-03 s Time to initialize coeftab 1.434340e-01 s Time to factorize 1.107614e+00 s (36.09 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Memory usage of coeftab 548 Ko Time to solve 1.170892e-01 s Time for refinement 8.008221e-02 s || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.202066e-16 max(|| b_i - A x_i ||_1) 1.630442e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.114158e-03 (SUCCESS) max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.202066e-16 max(|| b_i - A x_i ||_1) 1.630442e-16 
max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.114158e-03 (SUCCESS) || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.202066e-16 max(|| b_i - A x_i ||_1) 1.630442e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.114158e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.202066e-16 max(|| b_i - A x_i ||_1) 1.630442e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.114158e-03 (SUCCESS) Start 1935: mpi_rep_example_simple_lap_z_facto2_sched0_1d 1622/3626 Test #1939: mpi_rep_example_simple_lap_s_facto1_sched1_1d ...........................***Timeout 296.00 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.825846e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 
Fill-in of L 17.592432 Time to compute symbol matrix 9.369130e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.308930e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.104548e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.954350e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 5.239267e-03 s Time to initialize coeftab 2.455952e-01 s Time to factorize 2.596294e-01 s (20.16 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Memory usage of coeftab 68.5 Ko Time to solve 7.912020e-01 s Time for refinement 5.107405e+00 s || A ||_1 5.112398e-02 || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.714864e-07 max(|| b_i - A x_i ||_1) 7.661124e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.626699e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.714864e-07 max(|| b_i - A x_i ||_1) 7.661124e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.626699e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.714864e-07 max(|| b_i - A x_i ||_1) 7.661124e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.626699e-01 (SUCCESS) || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.714864e-07 max(|| b_i - A x_i ||_1) 7.661124e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 
9.626699e-01 (SUCCESS) Start 1939: mpi_rep_example_simple_lap_s_facto1_sched1_1d Test #1543: shm_example_simple_lap_c_facto2_sched4_not_tqrcpbegin ...................***Timeout 295.89 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.629280e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.199168e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.609983e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time 
to factorize 1.465112e-03 s Time for mapping/scheduling 3.866142e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.690828e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 5.815130e-03 s Time to initialize coeftab 7.300540e-01 s Time to factorize 6.704301e+00 s ( 5.96 MFlop/s) Number of operations 8.15 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 1.400239e+00 s Time for refinement 8.744320e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.321665e-07 max(|| b_i - A x_i ||_1) 1.327852e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.350567e+00 (SUCCESS) Start 1543: shm_example_simple_lap_c_facto2_sched4_not_tqrcpbegin 1622/3626 Test #1943: mpi_rep_example_simple_lap_d_facto2_sched1_1d ...........................***Timeout 295.34 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 
256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.616479e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.004836e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.168287e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 7.808859e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.504267e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.194096e-03 s Time to initialize coeftab 3.918967e-02 s Time to factorize 2.490772e-01 s (40.09 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Memory usage of coeftab 274 Ko Time to solve 5.011710e-01 s Time for refinement 1.415286e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.832069e-16 max(|| b_i - A x_i ||_1) 1.578672e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.983737e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.832069e-16 max(|| b_i - A x_i ||_1) 1.578672e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 
1.983737e-03 (SUCCESS) || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.832069e-16 max(|| b_i - A x_i ||_1) 1.578672e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.983737e-03 (SUCCESS) || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.832069e-16 max(|| b_i - A x_i ||_1) 1.578672e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.983737e-03 (SUCCESS) Start 1943: mpi_rep_example_simple_lap_d_facto2_sched1_1d 1622/3626 Test #1944: mpi_rep_example_simple_lap_c_facto0_sched1_1d ...........................***Timeout 295.07 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.065540e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 
9.752266e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.393533e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.287799e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.171799e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.873083e-03 s Time to initialize coeftab 3.020223e-01 s Time to factorize 3.189423e+00 s ( 6.36 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Memory usage of coeftab 137 Ko Time to solve 8.095437e-01 s Time for refinement 2.105452e+00 s || A ||_1 5.112398e-02 || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.864061e-07 max(|| b_i - A x_i ||_1) 8.478496e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.139378e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.864061e-07 max(|| b_i - A x_i ||_1) 8.478496e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.139378e+00 (SUCCESS) || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.864061e-07 max(|| b_i - A x_i ||_1) 8.478496e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.139378e+00 (SUCCESS) || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.864061e-07 max(|| b_i - A x_i ||_1) 8.478496e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.139378e+00 (SUCCESS) Start 1944: 
mpi_rep_example_simple_lap_c_facto0_sched1_1d 1622/3626 Test #1948: mpi_rep_example_simple_lap_c_facto4_sched1_1d ...........................***Timeout 293.47 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.725963e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.039107e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.061327e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.067445e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 
7.586298e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 2.123575e-03 s Time to initialize coeftab 1.067321e-01 s Time to factorize 1.296535e+00 s (16.43 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Memory usage of coeftab 137 Ko Time to solve 3.787272e+00 s Time for refinement 1.270709e+00 s || A ||_1 5.112398e-02 || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.812680e-07 || A ||_1 5.112398e-02 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.812680e-07 max(|| b_i - A x_i ||_1) 8.070607e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.036456e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.070607e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.036456e+00 (SUCCESS) max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.812680e-07 max(|| b_i - A x_i ||_1) 8.070607e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.036456e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.812680e-07 max(|| b_i - A x_i ||_1) 8.070607e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.036456e+00 (SUCCESS) Start 1948: mpi_rep_example_simple_lap_c_facto4_sched1_1d 1622/3626 Test #1952: mpi_rep_example_simple_lap_z_facto3_sched1_1d ...........................***Timeout 290.80 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: 
Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.142177e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.221065e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.314050e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 9.126197e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.365732e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 2.772229e-03 s Time to initialize coeftab 6.895661e-01 s Time to factorize 1.576559e+00 s (12.86 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Memory usage of coeftab 274 Ko Time to solve 1.646216e+00 s Time for refinement 2.178194e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.244613e-16 max(|| b_i - A 
x_i ||_1) 1.812670e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.573981e-03 (SUCCESS) || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.244613e-16 max(|| b_i - A x_i ||_1) 1.812670e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.573981e-03 (SUCCESS) || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.244613e-16 max(|| b_i - A x_i ||_1) 1.812670e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.573981e-03 (SUCCESS) max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.244613e-16 max(|| b_i - A x_i ||_1) 1.812670e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.573981e-03 (SUCCESS) Start 1952: mpi_rep_example_simple_lap_z_facto3_sched1_1d 1622/3626 Test #1956: mpi_rep_example_simple_lap_s_facto2_sched4_1d ...........................***Timeout 289.54 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 
+-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.407425e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.878120e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.370616e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.875003e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.495367e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.196987e-03 s Time to initialize coeftab 1.566657e-01 s Time to factorize 3.603026e+00 s ( 2.77 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Memory usage of coeftab 137 Ko Time to solve 4.302536e+00 s Time for refinement 1.563498e+00 s || A ||_1 5.112398e-02 || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.718177e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.718177e-07 max(|| b_i - A x_i ||_1) 7.701261e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.677134e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 7.701261e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.677134e-01 (SUCCESS) || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i 
||_2 / || b_i ||_2) 1.718177e-07 max(|| b_i - A x_i ||_1) 7.701261e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.677134e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.718177e-07 max(|| b_i - A x_i ||_1) 7.701261e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.677134e-01 (SUCCESS) Start 1956: mpi_rep_example_simple_lap_s_facto2_sched4_1d 1622/3626 Test #1957: mpi_rep_example_simple_lap_d_facto0_sched4_1d ...........................***Timeout 289.36 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.007358e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.860707e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.151305e-01 s +-------------------------------------------------+ 
Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 6.376475e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.742017e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.088092e-03 s Time to initialize coeftab 9.615384e-02 s Time to factorize 5.822683e+00 s (890.28 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Memory usage of coeftab 137 Ko Time to solve 4.494266e+00 s Time for refinement 3.112277e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.020637e-16 max(|| b_i - A x_i ||_1) 1.752444e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.202097e-03 (SUCCESS) || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.020637e-16 max(|| b_i - A x_i ||_1) 1.752444e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.202097e-03 (SUCCESS) || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.020637e-16 max(|| b_i - A x_i ||_1) 1.752444e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.202097e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.020637e-16 max(|| b_i - A x_i ||_1) 1.752444e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.202097e-03 (SUCCESS) Start 1957: mpi_rep_example_simple_lap_d_facto0_sched4_1d 1622/3626 Test #1959: mpi_rep_example_simple_lap_d_facto2_sched4_1d ...........................***Timeout 289.30 sec ischedInit: The thread number has been automatically set to 256 
ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.225703e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.723670e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.194573e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 5.636594e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.965982e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.027222e-02 s Time to initialize coeftab 3.936643e-01 s Time to factorize 3.456605e+00 s ( 
2.89 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Memory usage of coeftab 274 Ko Time to solve 3.323771e+00 s Time for refinement 7.240711e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.827282e-16 || A ||_1 5.112481e-02 max(|| b_i - A x_i ||_1) 1.596344e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.005943e-03 (SUCCESS) max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.827282e-16 max(|| b_i - A x_i ||_1) 1.596344e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.005943e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.827282e-16 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.827282e-16 max(|| b_i - A x_i ||_1) 1.596344e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.005943e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 1.596344e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.005943e-03 (SUCCESS) Start 1959: mpi_rep_example_simple_lap_d_facto2_sched4_1d 1622/3626 Test #1961: mpi_rep_example_simple_lap_c_facto1_sched4_1d ...........................***Timeout 288.92 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size 
(min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.278972e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.736715e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.241281e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.774943e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.868643e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.512189e-03 s Time to initialize coeftab 1.479104e-01 s Time to factorize 1.843249e+00 s (11.56 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Memory usage of coeftab 137 Ko Time to solve 2.352569e+00 s Time for refinement 1.292124e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.829165e-07 max(|| b_i - A x_i ||_1) 8.071240e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.036615e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.829165e-07 max(|| 
b_i - A x_i ||_1) 8.071240e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.036615e+00 (SUCCESS) || A ||_1 5.112398e-02 || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.829165e-07 max(|| b_i - A x_i ||_1) 8.071240e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.036615e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.829165e-07 max(|| b_i - A x_i ||_1) 8.071240e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.036615e+00 (SUCCESS) Start 1961: mpi_rep_example_simple_lap_c_facto1_sched4_1d 1622/3626 Test #1964: mpi_rep_example_simple_lap_c_facto4_sched4_1d ...........................***Timeout 289.36 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.980310e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of 
nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.958524e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.375514e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.031326e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.042167e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 2.552911e-03 s Time to initialize coeftab 9.850186e-01 s Time to factorize 3.583537e+00 s ( 5.95 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Memory usage of coeftab 137 Ko Time to solve 1.495855e+00 s Time for refinement 1.384985e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112398e-02 || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.793766e-07 max(|| b_i - A x_i ||_1) 7.979565e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.013483e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.793766e-07 max(|| b_i - A x_i ||_1) 7.979565e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.013483e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.793766e-07 max(|| b_i - A x_i ||_1) 7.979565e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.013483e+00 (SUCCESS) || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.793766e-07 max(|| b_i - A x_i ||_1) 7.979565e-08 max(|| b_i - A x_i ||_1 / 
(||A||_1 * ||x_i||_oo * eps)) 2.013483e+00 (SUCCESS) Start 1964: mpi_rep_example_simple_lap_c_facto4_sched4_1d Test #1553: shm_example_simple_lap_c_facto2_sched4_kwayprojections_rqrrtbegin .......***Timeout 289.08 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.090709e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.284651e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.184739e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 
MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 1.866269e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.146871e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.021643e-02 s Time to initialize coeftab 1.016020e+00 s Time to factorize 7.506731e+00 s ( 5.32 MFlop/s) Number of operations 8.15 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1 Mo / 1 Mo Time to solve 2.272653e+00 s - iteration 1 : total iteration time 2.45 s error 6.3363e-12 Time for refinement 4.245188e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822262e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.412039e-08 max(|| b_i - A x_i ||_1) 3.197147e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.067358e-01 (SUCCESS) Start 1553: shm_example_simple_lap_c_facto2_sched4_kwayprojections_rqrrtbegin 1622/3626 Test #1971: mpi_dst_example_simple_lap_s_facto1_sched0_1d ...........................***Timeout 290.02 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational 
models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.593744e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.174451e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.117696e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.990823e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.230061e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.679702e-01 s Time to initialize coeftab 8.508843e-02 s Time to factorize 8.627172e-01 s ( 6.07 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Memory usage of coeftab 68.5 Ko Time to solve 3.278156e-01 s Time for refinement 4.137558e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 
max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.727899e-07 max(|| b_i - A x_i ||_1) 7.535164e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.468614e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.727899e-07 max(|| b_i - A x_i ||_1) 7.535164e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.468614e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.727899e-07 max(|| b_i - A x_i ||_1) 7.535164e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.468614e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.727899e-07 max(|| b_i - A x_i ||_1) 7.535164e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.468614e-01 (SUCCESS) Start 1971: mpi_dst_example_simple_lap_s_facto1_sched0_1d 1622/3626 Test #1974: mpi_dst_example_simple_lap_d_facto1_sched0_1d ...........................***Timeout 289.58 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 
2.640523e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.737961e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.060080e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.523901e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.048106e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 5.645608e-01 s Time to initialize coeftab 1.077611e-01 s Time to factorize 1.108847e+00 s ( 4.72 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Memory usage of coeftab 137 Ko Time to solve 2.156583e-01 s Time for refinement 2.246544e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.853319e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.853319e-16 max(|| b_i - A x_i ||_1) 1.643660e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.065400e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.853319e-16 max(|| b_i - A x_i ||_1) 1.643660e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.065400e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.853319e-16 max(|| b_i - A x_i ||_1) 
1.643660e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.065400e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 1.643660e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.065400e-03 (SUCCESS) Start 1974: mpi_dst_example_simple_lap_d_facto1_sched0_1d 1622/3626 Test #1975: mpi_dst_example_simple_lap_d_facto2_sched0_1d ...........................***Timeout 289.40 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.386797e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.590770e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.521638e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in 
blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 4.292297e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.823061e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.235464e-02 s Time to initialize coeftab 1.450386e+00 s Time to factorize 5.707101e-01 s (17.50 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Memory usage of coeftab 274 Ko Time to solve 9.917636e-02 s Time for refinement 9.238802e-02 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.907938e-16 max(|| b_i - A x_i ||_1) 1.623087e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.039548e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.907938e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.907938e-16 max(|| b_i - A x_i ||_1) 1.623087e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.039548e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.907938e-16 max(|| b_i - A x_i ||_1) 1.623087e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.039548e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 1.623087e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.039548e-03 (SUCCESS) Start 1975: mpi_dst_example_simple_lap_d_facto2_sched0_1d 1622/3626 Test #1976: mpi_dst_example_simple_lap_c_facto0_sched0_1d ...........................***Timeout 289.15 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX 
: Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 3: 200 660 2: 200 760 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.234207e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.415384e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.201805e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 5.194261e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.289756e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.613321e-02 s Time to initialize coeftab 1.445622e+00 s Time to factorize 
2.652535e-01 s (76.46 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Memory usage of coeftab 137 Ko Time to solve 1.710485e-02 s Time for refinement 1.126367e-02 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.901771e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.901771e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.901771e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.901771e-07 max(|| b_i - A x_i ||_1) 8.648711e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.182371e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.648711e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.182371e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.648711e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.182371e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.648711e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.182371e+00 (SUCCESS) Start 1976: mpi_dst_example_simple_lap_c_facto0_sched0_1d 1622/3626 Test #1979: mpi_dst_example_simple_lap_c_facto3_sched0_1d ...........................***Timeout 288.91 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 
8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.045886e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.020868e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.768828e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.903873e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.241507e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 3.153214e-02 s Time to initialize coeftab 1.236153e-01 s Time to factorize 1.140010e-01 s (177.90 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Memory usage of coeftab 137 Ko Time to solve 1.294814e-02 s Time for refinement 8.491245e-03 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| 
x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.896020e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.896020e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.896020e-07 max(|| b_i - A x_i ||_1) 8.632809e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.178358e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.632809e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.178358e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.632809e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.178358e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.896020e-07 max(|| b_i - A x_i ||_1) 8.632809e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.178358e+00 (SUCCESS) Start 1979: mpi_dst_example_simple_lap_c_facto3_sched0_1d 1622/3626 Test #1980: mpi_dst_example_simple_lap_c_facto4_sched0_1d ...........................***Timeout 288.68 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 
1.993720e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.678893e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.438935e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.044986e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.304645e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 1.502502e-02 s Time to initialize coeftab 3.381822e-01 s Time to factorize 1.204712e-01 s (176.87 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Memory usage of coeftab 137 Ko Time to solve 1.656823e-02 s Time for refinement 8.561818e-03 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.796141e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.796141e-07 max(|| b_i - A x_i ||_1) 7.886904e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.990140e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.796141e-07 max(|| b_i - A x_i ||_1) 7.886904e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.990140e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.796141e-07 max(|| b_i - A x_i ||_1) 
7.886904e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.990140e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 7.886904e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.990140e+00 (SUCCESS) Start 1980: mpi_dst_example_simple_lap_c_facto4_sched0_1d Test #1566: shm_example_simple_lap_c_facto3_sched4_kway_pqrcpend ....................***Timeout 287.43 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.384163e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.192749e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.707809e-02 s 
+-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 2.276442e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.633325e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 3.190546e-03 s Time to initialize coeftab 2.907046e+00 s Time to factorize 2.675742e+00 s ( 7.58 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 1.649921e+00 s Time for refinement 8.542128e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.061547e-07 max(|| b_i - A x_i ||_1) 9.164778e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.312547e+00 (SUCCESS) Start 1566: shm_example_simple_lap_c_facto3_sched4_kway_pqrcpend 1622/3626 Test #1981: mpi_dst_example_simple_lap_z_facto0_sched0_1d ...........................***Timeout 287.25 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI 
communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.617088e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.640853e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.216096e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 6.545124e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.695000e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.410671e-02 s Time to initialize coeftab 1.461009e-01 s Time to factorize 3.887363e-01 s (52.17 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Memory usage of coeftab 274 Ko Time to solve 3.598718e-02 s Time for refinement 2.365510e-02 s || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 
5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.226520e-16 max(|| b_i - A x_i ||_1) 1.861569e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.697370e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.226520e-16 max(|| b_i - A x_i ||_1) 1.861569e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.697370e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.226520e-16 max(|| b_i - A x_i ||_1) 1.861569e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.697370e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.226520e-16 max(|| b_i - A x_i ||_1) 1.861569e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.697370e-03 (SUCCESS) Start 1981: mpi_dst_example_simple_lap_z_facto0_sched0_1d 1622/3626 Test #1982: mpi_dst_example_simple_lap_z_facto1_sched0_1d ...........................***Timeout 286.86 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 
200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.946764e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.883691e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.807039e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.040820e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.061318e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.917975e-01 s Time to initialize coeftab 2.380925e-01 s Time to factorize 2.279225e-01 s (93.49 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Memory usage of coeftab 274 Ko Time to solve 1.723213e-02 s Time for refinement 1.141808e-02 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.076580e-16 max(|| b_i - A x_i ||_1) 1.750298e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.416596e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.076580e-16 max(|| b_i - A x_i ||_1) 1.750298e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.416596e-03 (SUCCESS) max(|| 
b_i - A x_i ||_2 / || b_i ||_2) 4.076580e-16 max(|| b_i - A x_i ||_1) 1.750298e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.416596e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.076580e-16 max(|| b_i - A x_i ||_1) 1.750298e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.416596e-03 (SUCCESS) Start 1982: mpi_dst_example_simple_lap_z_facto1_sched0_1d Test #1571: shm_example_simple_lap_c_facto3_sched4_kway_rqrcpbegin ..................***Timeout 286.45 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.129801e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.661769e-01 s +-------------------------------------------------+ Reordering 
subtask: Split level 0 Stoping criterion -1 Time for reordering 7.889385e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 2.978453e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.189241e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 6.442649e-02 s Time to initialize coeftab 1.927478e+00 s Time to factorize 4.790953e+00 s ( 4.23 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 1.297348e+00 s - iteration 1 : total iteration time 3.34 s error 5.1008e-11 Time for refinement 4.842119e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822262e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.306195e-08 max(|| b_i - A x_i ||_1) 3.228979e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.147680e-01 (SUCCESS) Start 1571: shm_example_simple_lap_c_facto3_sched4_kway_rqrcpbegin 1622/3626 Test #1983: mpi_dst_example_simple_lap_z_facto2_sched0_1d ...........................***Timeout 286.44 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread 
dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.586810e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.453753e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.895442e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 5.210085e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.658875e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 7.922360e-02 s Time to initialize coeftab 8.182320e-02 s Time to factorize 1.891007e+00 s (21.14 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Memory usage of coeftab 548 Ko Time to solve 7.200779e-02 s Time for refinement 4.884789e-02 s || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 || A 
||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.204884e-16 max(|| b_i - A x_i ||_1) 1.691585e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.268442e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.204884e-16 max(|| b_i - A x_i ||_1) 1.691585e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.268442e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.204884e-16 max(|| b_i - A x_i ||_1) 1.691585e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.268442e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.204884e-16 max(|| b_i - A x_i ||_1) 1.691585e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.268442e-03 (SUCCESS) Start 1983: mpi_dst_example_simple_lap_z_facto2_sched0_1d 1622/3626 Test #1986: mpi_dst_example_simple_lap_s_facto0_sched1_1d ...........................***Timeout 285.74 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 
256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.620526e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.186708e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.962034e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 4.290444e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.712894e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.130084e-01 s Time to initialize coeftab 1.910538e-01 s Time to factorize 4.685894e-01 s (10.80 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Memory usage of coeftab 68.5 Ko Time to solve 4.541405e-01 s Time for refinement 7.301708e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.880132e-07 max(|| b_i - A x_i ||_1) 8.525554e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.071313e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 
1.880132e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.880132e-07 max(|| b_i - A x_i ||_1) 8.525554e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.071313e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.525554e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.071313e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.880132e-07 max(|| b_i - A x_i ||_1) 8.525554e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.071313e+00 (SUCCESS) Start 1986: mpi_dst_example_simple_lap_s_facto0_sched1_1d 1622/3626 Test #1991: mpi_dst_example_simple_lap_d_facto2_sched1_1d ...........................***Timeout 282.29 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.337313e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 
6.895863e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.441794e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 3.379750e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.383999e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.669949e-02 s Time to initialize coeftab 1.424654e-01 s Time to factorize 4.409401e-01 s (22.64 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Memory usage of coeftab 274 Ko Time to solve 4.190638e-01 s Time for refinement 7.261652e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.898372e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.898372e-16 max(|| b_i - A x_i ||_1) 1.620014e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.035686e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.898372e-16 max(|| b_i - A x_i ||_1) 1.620014e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.035686e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 1.620014e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.035686e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.898372e-16 max(|| b_i - A x_i ||_1) 1.620014e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.035686e-03 (SUCCESS) Start 1991: 
mpi_dst_example_simple_lap_d_facto2_sched1_1d 1622/3626 Test #1992: mpi_dst_example_simple_lap_c_facto0_sched1_1d ...........................***Timeout 282.26 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.444641e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.106587e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.735458e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.844015e+00 s 
+-------------------------------------------------+ Analyze task: Total time for analyze 2.662844e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.741259e-01 s Time to initialize coeftab 1.541305e-01 s Time to factorize 8.487520e-01 s (23.89 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Memory usage of coeftab 137 Ko Time to solve 1.600751e+00 s Time for refinement 8.092032e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.957017e-07 max(|| b_i - A x_i ||_1) 8.824143e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.226638e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.957017e-07 max(|| b_i - A x_i ||_1) 8.824143e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.226638e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.957017e-07 max(|| b_i - A x_i ||_1) 8.824143e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.226638e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.957017e-07 max(|| b_i - A x_i ||_1) 8.824143e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.226638e+00 (SUCCESS) Start 1992: mpi_dst_example_simple_lap_c_facto0_sched1_1d 1622/3626 Test #1993: mpi_dst_example_simple_lap_c_facto1_sched1_1d ...........................***Timeout 280.58 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled 
PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.766694e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.454623e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.191921e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.292723e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.854362e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.551638e-02 s Time to initialize coeftab 1.521955e-01 s Time to factorize 4.911373e-01 s (43.38 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Memory usage of coeftab 137 Ko Time to solve 8.887028e-01 s Time for refinement 1.474945e+00 s || 
A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.812844e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.812844e-07 max(|| b_i - A x_i ||_1) 7.962694e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.009265e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.812844e-07 max(|| b_i - A x_i ||_1) 7.962694e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.009265e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.812844e-07 max(|| b_i - A x_i ||_1) 7.962694e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.009265e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 7.962694e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.009265e+00 (SUCCESS) Start 1993: mpi_dst_example_simple_lap_c_facto1_sched1_1d Test #1106: shm_example_simple_lap_c_facto4_sched1_kwayprojections_rqrrtend .........***Timeout 276.49 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block 
Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.002941e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.621739e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.722687e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.054681e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.076959e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 3.600942e-03 s Time to initialize coeftab 1.459665e-01 s Time to factorize 6.046166e+00 s ( 3.52 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 8.313838e-01 s Time for refinement 5.750563e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.056947e-07 max(|| b_i - A 
x_i ||_1) 8.842385e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.231198e+00 (SUCCESS) 1623/3626 Test #2003: mpi_dst_example_simple_lap_s_facto1_sched4_1d ...........................***Timeout 275.98 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.769423e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.096621e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.008466e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for 
mapping/scheduling 8.612416e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.872396e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 9.709956e-02 s Time to initialize coeftab 2.625289e-01 s Time to factorize 1.573119e+00 s ( 3.33 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Memory usage of coeftab 68.5 Ko Time to solve 2.410296e+00 s Time for refinement 1.096610e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.702082e-07 max(|| b_i - A x_i ||_1) 7.415321e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.318020e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.702082e-07 max(|| b_i - A x_i ||_1) 7.415321e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.318020e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.702082e-07 max(|| b_i - A x_i ||_1) 7.415321e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.318020e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.702082e-07 max(|| b_i - A x_i ||_1) 7.415321e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.318020e-01 (SUCCESS) Start 2003: mpi_dst_example_simple_lap_s_facto1_sched4_1d Test #1591: shm_example_simple_lap_c_facto4_sched4_kway_svdbegin ....................***Timeout 274.93 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread 
dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.040118e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.201428e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.797153e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.400296e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.062892e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 2.398263e-02 s Time to initialize coeftab 8.376904e-01 s Time to factorize 3.910067e+00 s ( 5.45 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 
Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 1.866051e+00 s Time for refinement 7.700457e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.145875e-07 max(|| b_i - A x_i ||_1) 9.631081e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.430210e+00 (SUCCESS) Start 1591: shm_example_simple_lap_c_facto4_sched4_kway_svdbegin 1623/3626 Test #2004: mpi_dst_example_simple_lap_s_facto2_sched4_1d ...........................***Timeout 274.34 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.737247e+01 s +-------------------------------------------------+ Symbolic 
factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.906386e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.714941e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 9.026562e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.861657e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.448655e-01 s Time to initialize coeftab 4.654826e-02 s Time to factorize 1.184709e+00 s ( 8.43 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Memory usage of coeftab 137 Ko Time to solve 4.410173e+00 s Time for refinement 8.275620e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.711419e-07 max(|| b_i - A x_i ||_1) 7.463102e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.378062e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.711419e-07 max(|| b_i - A x_i ||_1) 7.463102e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.378062e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.711419e-07 max(|| b_i - A x_i ||_1) 7.463102e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.378062e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.711419e-07 
max(|| b_i - A x_i ||_1) 7.463102e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.378062e-01 (SUCCESS) Start 2004: mpi_dst_example_simple_lap_s_facto2_sched4_1d Start 2104: mpi_dst_example_simple_lap_s_facto2_sched0_kwayprojections_tqrcpbegin Start 2105: mpi_dst_example_simple_lap_s_facto2_sched0_kwayprojections_tqrcpend Start 2106: mpi_dst_example_simple_lap_s_facto2_sched0_not_rqrrtbegin Start 2107: mpi_dst_example_simple_lap_s_facto2_sched0_not_rqrrtend Start 2108: mpi_dst_example_simple_lap_s_facto2_sched0_kway_rqrrtbegin Start 2109: mpi_dst_example_simple_lap_s_facto2_sched0_kway_rqrrtend Start 2110: mpi_dst_example_simple_lap_s_facto2_sched0_kwayprojections_rqrrtbegin Start 2111: mpi_dst_example_simple_lap_s_facto2_sched0_kwayprojections_rqrrtend Start 2112: mpi_dst_example_simple_lap_s_facto2_sched0_kway_pqrcpilu0 Start 2113: mpi_dst_example_simple_lap_s_facto2_sched0_kway_pqrcpilu1 Start 2114: mpi_dst_example_simple_lap_d_facto0_sched0_not_svdbegin Start 2115: mpi_dst_example_simple_lap_d_facto0_sched0_not_svdend Start 2116: mpi_dst_example_simple_lap_d_facto0_sched0_kway_svdbegin Start 2117: mpi_dst_example_simple_lap_d_facto0_sched0_kway_svdend Start 2118: mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_svdbegin Start 2119: mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_svdend Start 2120: mpi_dst_example_simple_lap_d_facto0_sched0_not_pqrcpbegin Start 2121: mpi_dst_example_simple_lap_d_facto0_sched0_not_pqrcpend Start 2122: mpi_dst_example_simple_lap_d_facto0_sched0_kway_pqrcpbegin Start 2123: mpi_dst_example_simple_lap_d_facto0_sched0_kway_pqrcpend Start 2124: mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_pqrcpbegin Start 2125: mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_pqrcpend Start 2126: mpi_dst_example_simple_lap_d_facto0_sched0_not_rqrcpbegin Start 2127: mpi_dst_example_simple_lap_d_facto0_sched0_not_rqrcpend Start 2128: mpi_dst_example_simple_lap_d_facto0_sched0_kway_rqrcpbegin Start 
2129: mpi_dst_example_simple_lap_d_facto0_sched0_kway_rqrcpend Start 2130: mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_rqrcpbegin Start 2131: mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_rqrcpend Start 2132: mpi_dst_example_simple_lap_d_facto0_sched0_not_tqrcpbegin Start 2133: mpi_dst_example_simple_lap_d_facto0_sched0_not_tqrcpend Start 2134: mpi_dst_example_simple_lap_d_facto0_sched0_kway_tqrcpbegin 1623/3626 Test #2017: mpi_dst_example_simple_lap_z_facto4_sched4_1d ........................... Passed 261.45 sec 1624/3626 Test #2016: mpi_dst_example_simple_lap_z_facto3_sched4_1d ........................... Passed 263.11 sec 1625/3626 Test #2011: mpi_dst_example_simple_lap_c_facto3_sched4_1d ........................... Passed 264.92 sec Test #1599: shm_example_simple_lap_c_facto4_sched4_kwayprojections_pqrcpbegin ....... Passed 271.50 sec 1627/3626 Test #2013: mpi_dst_example_simple_lap_z_facto0_sched4_1d ........................... Passed 264.43 sec Test #1600: shm_example_simple_lap_c_facto4_sched4_kwayprojections_pqrcpend ......... Passed 271.50 sec 1629/3626 Test #2031: mpi_dst_example_simple_lap_s_facto0_sched0_not_rqrcpend ................. Passed 241.89 sec 1630/3626 Test #2021: mpi_dst_example_simple_lap_s_facto0_sched0_kway_svdend .................. Passed 253.38 sec Test #1603: shm_example_simple_lap_c_facto4_sched4_kway_rqrcpbegin .................. Passed 271.41 sec 1632/3626 Test #2029: mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_pqrcpend ..... Passed 243.43 sec 1633/3626 Test #2062: mpi_dst_example_simple_lap_s_facto1_sched0_not_rqrcpbegin ............... Passed 224.15 sec 1634/3626 Test #2018: mpi_dst_example_simple_lap_s_facto0_sched0_not_svdbegin ................. Passed 260.82 sec 1635/3626 Test #2023: mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_svdend ....... Passed 247.92 sec 1636/3626 Test #2015: mpi_dst_example_simple_lap_z_facto2_sched4_1d ........................... 
Passed 264.18 sec 1637/3626 Test #2079: mpi_dst_example_simple_lap_s_facto1_sched0_kwayprojections_rqrrtend ..... Passed 212.70 sec Test #1634: shm_example_simple_lap_z_facto0_sched4_not_rqrcpend ..................... Passed 254.70 sec 1639/3626 Test #2067: mpi_dst_example_simple_lap_s_facto1_sched0_kwayprojections_rqrcpend ..... Passed 222.28 sec 1640/3626 Test #2078: mpi_dst_example_simple_lap_s_facto1_sched0_kwayprojections_rqrrtbegin ... Passed 212.82 sec 1641/3626 Test #2086: mpi_dst_example_simple_lap_s_facto2_sched0_kwayprojections_svdbegin ..... Passed 208.46 sec 1642/3626 Test #2061: mpi_dst_example_simple_lap_s_facto1_sched0_kwayprojections_pqrcpend ..... Passed 224.26 sec Test #1666: shm_example_simple_lap_z_facto1_sched4_not_rqrcpend ..................... Passed 225.06 sec 1644/3626 Test #2073: mpi_dst_example_simple_lap_s_facto1_sched0_kwayprojections_tqrcpend ..... Passed 217.52 sec 1645/3626 Test #2081: mpi_dst_example_simple_lap_s_facto1_sched0_kway_pqrcpilu1 ............... Passed 211.77 sec 1646/3626 Test #2080: mpi_dst_example_simple_lap_s_facto1_sched0_kway_pqrcpilu0 ............... Passed 212.16 sec Test #1672: shm_example_simple_lap_z_facto1_sched4_not_tqrcpend ..................... Passed 221.42 sec 1648/3626 Test #2085: mpi_dst_example_simple_lap_s_facto2_sched0_kway_svdend .................. Passed 208.54 sec Test #1203: shm_example_simple_lap_z_facto2_sched1_kway_pqrcpilu0 ................... Passed 212.82 sec Test #1682: shm_example_simple_lap_z_facto1_sched4_kwayprojections_rqrrtend ......... Passed 220.04 sec 1651/3626 Test #2084: mpi_dst_example_simple_lap_s_facto2_sched0_kway_svdbegin ................ Passed 208.98 sec 1652/3626 Test #2077: mpi_dst_example_simple_lap_s_facto1_sched0_kway_rqrrtend ................ Passed 214.25 sec 1653/3626 Test #2076: mpi_dst_example_simple_lap_s_facto1_sched0_kway_rqrrtbegin .............. 
Passed 214.87 sec 1654/3626 Test #2074: mpi_dst_example_simple_lap_s_facto1_sched0_not_rqrrtbegin ............... Passed 216.99 sec 1655/3626 Test #2083: mpi_dst_example_simple_lap_s_facto2_sched0_not_svdend ................... Passed 209.04 sec Test #1736: shm_example_simple_lap_z_facto3_sched4_not_tqrcpend ..................... Passed 174.84 sec 1657/3626 Test #2092: mpi_dst_example_simple_lap_s_facto2_sched0_kwayprojections_pqrcpbegin ... Passed 201.44 sec Test #1810: c_mpi_rep_example_simple_lap_z_facto2 ................................... Passed 120.75 sec 1659/3626 Test #1845: c_mpi_rep_example_step-by-step_lap_s_facto0 .............................***Timeout 351.76 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 1845: c_mpi_rep_example_step-by-step_lap_s_facto0 1659/3626 Test #1846: c_mpi_rep_example_step-by-step_lap_s_facto1 .............................***Timeout 351.60 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been 
automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.778103e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.200904e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.625156e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.749165e-01 s Time to initialize internal csc 2.350583e-03 s Time to initialize coeftab 5.882401e-01 s Time to factorize 1.169686e+00 s ( 4.47 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Memory usage of coeftab 68.5 Ko Time to solve 3.712649e+00 s WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: 
Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Time for refinement 3.519528e+01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.176381e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.918301e-07 max(|| b_i - A x_i ||_1) 8.173989e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.049781e+00 (SUCCESS) max(|| x_i ||_oo) 4.996108e-01 max(|| x0_i - x_i ||_oo) 5.066395e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 1.014473e+00 (SUCCESS) || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.176381e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.176381e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.918301e-07 max(|| b_i - A x_i ||_1) 8.173989e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.049781e+00 (SUCCESS) max(|| x_i ||_oo) 4.996108e-01 max(|| x0_i - x_i ||_oo) 5.066395e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 1.014473e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.918301e-07 max(|| b_i - A x_i ||_1) 8.173989e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.049781e+00 (SUCCESS) max(|| x_i ||_oo) 4.996108e-01 max(|| x0_i - x_i ||_oo) 5.066395e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 1.014473e+00 (SUCCESS) || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.176381e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.918301e-07 max(|| b_i - A x_i ||_1) 8.173989e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.049781e+00 (SUCCESS) max(|| x_i ||_oo) 4.996108e-01 max(|| x0_i - x_i ||_oo) 5.066395e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 1.014473e+00 (SUCCESS) Time to solve 2.467603e+00 s WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each 
RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Start 1846: c_mpi_rep_example_step-by-step_lap_s_facto1 1659/3626 Test #1847: c_mpi_rep_example_step-by-step_lap_s_facto2 .............................***Timeout 350.72 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 1847: c_mpi_rep_example_step-by-step_lap_s_facto2 1659/3626 Test #1849: c_mpi_rep_example_step-by-step_lap_d_facto1 .............................***Timeout 349.70 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: 
Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.824690e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.488498e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.852699e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.074465e+00 s Time to initialize internal csc 3.767455e-02 s Time to initialize coeftab 1.162175e-01 s Time to factorize 8.114219e-01 s ( 6.45 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Memory usage of coeftab 137 Ko Time to solve 3.559606e+00 s WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Time for refinement 1.457186e+01 s || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.176416e-02 
max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.176416e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.176416e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.140708e-16 max(|| b_i - A x_i ||_1) 1.661321e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.144030e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.140708e-16 max(|| b_i - A x_i ||_1) 1.661321e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.144030e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.140708e-16 max(|| b_i - A x_i ||_1) 1.661321e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.144030e-03 (SUCCESS) max(|| x_i ||_oo) 4.996108e-01 max(|| x0_i - x_i ||_oo) 1.110223e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.223106e-03 (SUCCESS) max(|| x_i ||_oo) 4.996108e-01 max(|| x0_i - x_i ||_oo) 1.110223e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.223106e-03 (SUCCESS) max(|| x_i ||_oo) 4.996108e-01 max(|| x0_i - x_i ||_oo) 1.110223e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.223106e-03 (SUCCESS) || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.176416e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.140708e-16 max(|| b_i - A x_i ||_1) 1.661321e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.144030e-03 (SUCCESS) max(|| x_i ||_oo) 4.996108e-01 max(|| x0_i - x_i ||_oo) 1.110223e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.223106e-03 (SUCCESS) Time to solve 3.041599e+00 s WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Time for refinement 3.223239e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.176416e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 
2.176416e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.176416e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.176416e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.182948e-16 max(|| b_i - A x_i ||_1) 1.676122e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.168265e-03 (SUCCESS) max(|| x_i ||_oo) 4.996108e-01 max(|| x0_i - x_i ||_oo) 9.992007e-16 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.004526e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.182948e-16 max(|| b_i - A x_i ||_1) 1.676122e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.168265e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.182948e-16 max(|| b_i - A x_i ||_1) 1.676122e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.168265e-03 (SUCCESS) max(|| x_i ||_oo) 4.996108e-01 max(|| x0_i - x_i ||_oo) 9.992007e-16 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.004526e-03 (SUCCESS) max(|| x_i ||_oo) 4.996108e-01 max(|| x0_i - x_i ||_oo) 9.992007e-16 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.004526e-03 (SUCCESS) Time to initialize internal csc 1.275954e-02 s max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.182948e-16 max(|| b_i - A x_i ||_1) 1.676122e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.168265e-03 (SUCCESS) max(|| x_i ||_oo) 4.996108e-01 max(|| x0_i - x_i ||_oo) 9.992007e-16 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.004526e-03 (SUCCESS) Time to initialize coeftab 8.575242e-02 s Time to factorize 1.511827e+00 s ( 3.46 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Memory usage of coeftab 137 Ko Time to solve 1.448807e+00 s WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS 
one by one Time for refinement 1.656437e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.176416e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.176416e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.182750e-16 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.176416e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i - A x_i ||_1) 1.673100e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.166653e-03 (SUCCESS) max(|| b_i ||_oo) 2.176416e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| x_i ||_oo) 4.996108e-01 max(|| x0_i - x_i ||_oo) 9.992007e-16 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.004526e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.182750e-16 max(|| b_i - A x_i ||_1) 1.673100e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.166653e-03 (SUCCESS) max(|| x_i ||_oo) 4.996108e-01 max(|| x0_i - x_i ||_oo) 9.992007e-16 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.004526e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.182750e-16 max(|| b_i - A x_i ||_1) 1.673100e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.166653e-03 (SUCCESS) max(|| x_i ||_oo) 4.996108e-01 max(|| x0_i - x_i ||_oo) 9.992007e-16 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.004526e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.182750e-16 max(|| b_i - A x_i ||_1) 1.673100e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.166653e-03 (SUCCESS) max(|| x_i ||_oo) 4.996108e-01 max(|| x0_i - x_i ||_oo) 9.992007e-16 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.004526e-03 (SUCCESS) Time to solve 1.920355e+00 s WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Start 1849: 
c_mpi_rep_example_step-by-step_lap_d_facto1 1659/3626 Test #1850: c_mpi_rep_example_step-by-step_lap_d_facto2 .............................***Timeout 346.47 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Start 1850: c_mpi_rep_example_step-by-step_lap_d_facto2 1659/3626 Test #1852: c_mpi_rep_example_step-by-step_lap_c_facto1 .............................***Timeout 327.55 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 
Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 1852: c_mpi_rep_example_step-by-step_lap_c_facto1 1659/3626 Test #1853: c_mpi_rep_example_step-by-step_lap_c_facto2 .............................***Timeout 327.47 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Start 1853: c_mpi_rep_example_step-by-step_lap_c_facto2 1659/3626 Test #1854: c_mpi_rep_example_step-by-step_lap_c_facto3 .............................***Timeout 327.42 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 1854: c_mpi_rep_example_step-by-step_lap_c_facto3 1659/3626 Test #1856: c_mpi_rep_example_step-by-step_lap_z_facto0 .............................***Timeout 327.19 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX 
: Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 1856: c_mpi_rep_example_step-by-step_lap_z_facto0 1659/3626 Test #1857: c_mpi_rep_example_step-by-step_lap_z_facto1 .............................***Timeout 327.08 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.026248e+01 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.922776e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.784331e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.920664e+00 s Time to initialize internal csc 9.415446e-02 s Time to initialize coeftab 1.173955e+00 s Time to factorize 1.498795e+00 s (14.22 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Memory usage of coeftab 274 Ko Time to solve 2.813354e+00 s WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Time for refinement 3.120415e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.764616e-02 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.048391e-16 max(|| b_i - A x_i ||_1) 1.694504e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.275809e-03 (SUCCESS) || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.764616e-02 max(|| x_i ||_oo) 7.029080e-01 max(|| x_i ||_oo) 7.029080e-01 max(|| x0_i - x_i ||_oo) 1.217455e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 1.738411e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.048391e-16 max(|| b_i - A x_i ||_1) 1.694504e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.275809e-03 (SUCCESS) max(|| x_i ||_oo) 7.029080e-01 max(|| x0_i - x_i ||_oo) 
1.217455e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 1.738411e-03 (SUCCESS) || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.764616e-02 max(|| x_i ||_oo) 7.029080e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.764616e-02 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.048391e-16 max(|| b_i - A x_i ||_1) 1.694504e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.275809e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.048391e-16 max(|| b_i - A x_i ||_1) 1.694504e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.275809e-03 (SUCCESS) max(|| x_i ||_oo) 7.029080e-01 max(|| x0_i - x_i ||_oo) 1.217455e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 1.738411e-03 (SUCCESS) max(|| x_i ||_oo) 7.029080e-01 max(|| x0_i - x_i ||_oo) 1.217455e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 1.738411e-03 (SUCCESS) Time to solve 3.679182e+00 s WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Time for refinement 2.213142e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.764616e-02 max(|| x_i ||_oo) 7.029080e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.764616e-02 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.019196e-16 max(|| b_i - A x_i ||_1) 1.693478e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.273219e-03 (SUCCESS) max(|| x_i ||_oo) 7.029080e-01 max(|| x0_i - x_i ||_oo) 1.217455e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 1.738411e-03 (SUCCESS) || A ||_1 5.112481e-02 Start 1857: c_mpi_rep_example_step-by-step_lap_z_facto1 Test #1360: shm_example_simple_lap_s_facto2_sched4_kway_rqrrtend ....................***Timeout 326.80 sec Start 1360: shm_example_simple_lap_s_facto2_sched4_kway_rqrrtend Test #1361: 
shm_example_simple_lap_s_facto2_sched4_kwayprojections_rqrrtbegin .......***Timeout 326.79 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 1361: shm_example_simple_lap_s_facto2_sched4_kwayprojections_rqrrtbegin Test #1372: shm_example_simple_lap_d_facto0_sched4_not_pqrcpend .....................***Timeout 326.28 sec Start 1372: shm_example_simple_lap_d_facto0_sched4_not_pqrcpend Test #1374: shm_example_simple_lap_d_facto0_sched4_kway_pqrcpend ....................***Timeout 325.95 sec Start 1374: shm_example_simple_lap_d_facto0_sched4_kway_pqrcpend Test #1378: shm_example_simple_lap_d_facto0_sched4_not_rqrcpend .....................***Timeout 325.91 sec Start 1378: shm_example_simple_lap_d_facto0_sched4_not_rqrcpend Test #1381: shm_example_simple_lap_d_facto0_sched4_kwayprojections_rqrcpbegin .......***Timeout 325.70 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX 
: Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Start 1381: shm_example_simple_lap_d_facto0_sched4_kwayprojections_rqrcpbegin Test #1397: shm_example_simple_lap_d_facto1_sched4_not_svdbegin .....................***Timeout 325.19 sec Start 1397: shm_example_simple_lap_d_facto1_sched4_not_svdbegin Test #1427: shm_example_simple_lap_d_facto1_sched4_kway_pqrcpilu0 ...................***Timeout 323.73 sec Start 1427: shm_example_simple_lap_d_facto1_sched4_kway_pqrcpilu0 Test #1437: shm_example_simple_lap_d_facto2_sched4_kway_pqrcpbegin ..................***Timeout 323.27 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size 
(min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Start 1437: shm_example_simple_lap_d_facto2_sched4_kway_pqrcpbegin Test #1453: shm_example_simple_lap_d_facto2_sched4_not_rqrrtbegin ...................***Timeout 322.16 sec Start 1453: shm_example_simple_lap_d_facto2_sched4_not_rqrrtbegin Test #1455: shm_example_simple_lap_d_facto2_sched4_kway_rqrrtbegin ..................***Timeout 322.11 sec Start 1455: shm_example_simple_lap_d_facto2_sched4_kway_rqrrtbegin Test #1463: shm_example_simple_lap_c_facto0_sched4_kway_svdbegin ....................***Timeout 321.82 sec Start 1463: shm_example_simple_lap_c_facto0_sched4_kway_svdbegin Test #1471: shm_example_simple_lap_c_facto0_sched4_kwayprojections_pqrcpbegin .......***Timeout 321.71 sec Start 1471: shm_example_simple_lap_c_facto0_sched4_kwayprojections_pqrcpbegin Test #1473: shm_example_simple_lap_c_facto0_sched4_not_rqrcpbegin ...................***Timeout 321.67 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD 
Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 1473: shm_example_simple_lap_c_facto0_sched4_not_rqrcpbegin Test #1475: shm_example_simple_lap_c_facto0_sched4_kway_rqrcpbegin ..................***Timeout 321.63 sec Start 1475: shm_example_simple_lap_c_facto0_sched4_kway_rqrcpbegin Test #1476: shm_example_simple_lap_c_facto0_sched4_kway_rqrcpend ....................***Timeout 321.59 sec Start 1476: shm_example_simple_lap_c_facto0_sched4_kway_rqrcpend Test #1479: shm_example_simple_lap_c_facto0_sched4_not_tqrcpbegin ...................***Timeout 321.55 sec Start 1479: shm_example_simple_lap_c_facto0_sched4_not_tqrcpbegin Test #1482: shm_example_simple_lap_c_facto0_sched4_kway_tqrcpend ....................***Timeout 321.50 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance 
criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 1482: shm_example_simple_lap_c_facto0_sched4_kway_tqrcpend Test #1490: shm_example_simple_lap_c_facto0_sched4_kwayprojections_rqrrtend .........***Timeout 321.17 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 1490: shm_example_simple_lap_c_facto0_sched4_kwayprojections_rqrrtend Test #1491: shm_example_simple_lap_c_facto0_sched4_kway_pqrcpilu0 ...................***Timeout 321.12 sec Start 1491: shm_example_simple_lap_c_facto0_sched4_kway_pqrcpilu0 Test #1495: shm_example_simple_lap_c_facto1_sched4_kway_svdbegin ....................***Timeout 321.00 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + 
PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 1495: shm_example_simple_lap_c_facto1_sched4_kway_svdbegin Test #1497: shm_example_simple_lap_c_facto1_sched4_kwayprojections_svdbegin .........***Timeout 320.98 sec Start 1497: shm_example_simple_lap_c_facto1_sched4_kwayprojections_svdbegin Test #1500: shm_example_simple_lap_c_facto1_sched4_not_pqrcpend .....................***Timeout 320.98 sec Start 1500: shm_example_simple_lap_c_facto1_sched4_not_pqrcpend Test #1511: shm_example_simple_lap_c_facto1_sched4_not_tqrcpbegin ...................***Timeout 320.45 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 
160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 1511: shm_example_simple_lap_c_facto1_sched4_not_tqrcpbegin Test #1513: shm_example_simple_lap_c_facto1_sched4_kway_tqrcpbegin ..................***Timeout 320.41 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 1513: shm_example_simple_lap_c_facto1_sched4_kway_tqrcpbegin Test #1514: shm_example_simple_lap_c_facto1_sched4_kway_tqrcpend ....................***Timeout 320.42 sec Start 1514: 
shm_example_simple_lap_c_facto1_sched4_kway_tqrcpend Test #1515: shm_example_simple_lap_c_facto1_sched4_kwayprojections_tqrcpbegin .......***Timeout 320.40 sec Start 1515: shm_example_simple_lap_c_facto1_sched4_kwayprojections_tqrcpbegin Test #1517: shm_example_simple_lap_c_facto1_sched4_not_rqrrtbegin ...................***Timeout 320.26 sec Start 1517: shm_example_simple_lap_c_facto1_sched4_not_rqrrtbegin Test #1522: shm_example_simple_lap_c_facto1_sched4_kwayprojections_rqrrtend .........***Timeout 320.08 sec Start 1522: shm_example_simple_lap_c_facto1_sched4_kwayprojections_rqrrtend Test #1523: shm_example_simple_lap_c_facto1_sched4_kway_pqrcpilu0 ...................***Timeout 320.04 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 1523: shm_example_simple_lap_c_facto1_sched4_kway_pqrcpilu0 Test #1526: shm_example_simple_lap_c_facto2_sched4_not_svdend .......................***Timeout 319.94 sec ischedInit: The thread number has been automatically set 
to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 1526: shm_example_simple_lap_c_facto2_sched4_not_svdend Test #1528: shm_example_simple_lap_c_facto2_sched4_kway_svdend ......................***Timeout 319.86 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block 
Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 1528: shm_example_simple_lap_c_facto2_sched4_kway_svdend 1659/3626 Test #1858: c_mpi_rep_example_step-by-step_lap_z_facto2 .............................***Timeout 319.85 sec Start 1858: c_mpi_rep_example_step-by-step_lap_z_facto2 1659/3626 Test #1859: c_mpi_rep_example_step-by-step_lap_z_facto3 .............................***Timeout 319.85 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 1859: c_mpi_rep_example_step-by-step_lap_z_facto3 1659/3626 Test #1860: c_mpi_rep_example_step-by-step_lap_z_facto4 .............................***Timeout 319.84 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: 
Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.196313e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.044197e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.964837e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.734934e+00 s Time to initialize internal csc 1.992274e-02 s Time to initialize coeftab 5.579225e-01 s Time to factorize 1.818356e+00 s (11.72 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Memory usage of coeftab 274 Ko Time to solve 2.504889e+00 s WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Time for refinement 2.158518e+01 s Start 1860: c_mpi_rep_example_step-by-step_lap_z_facto4 
1659/3626 Test #1861: c_mpi_rep_example_personal_lap_s_facto0 .................................***Timeout 319.84 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 1861: c_mpi_rep_example_personal_lap_s_facto0 1659/3626 Test #1864: c_mpi_rep_example_personal_lap_d_facto0 .................................***Timeout 319.76 sec ischedInit: The thread number has been automatically set to 256 Start 1864: c_mpi_rep_example_personal_lap_d_facto0 Test #1039: shm_example_simple_lap_c_facto2_sched1_kway_rqrrtbegin ..................***Timeout 319.44 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 
Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Test #1047: shm_example_simple_lap_c_facto3_sched1_kway_svdbegin ....................***Timeout 319.29 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Test #1461: shm_example_simple_lap_c_facto0_sched4_not_svdbegin .....................***Timeout 319.24 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: 
Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 1461: shm_example_simple_lap_c_facto0_sched4_not_svdbegin Test #1465: shm_example_simple_lap_c_facto0_sched4_kwayprojections_svdbegin .........***Timeout 319.22 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 
1465: shm_example_simple_lap_c_facto0_sched4_kwayprojections_svdbegin Test #1525: shm_example_simple_lap_c_facto2_sched4_not_svdbegin .....................***Timeout 319.21 sec Start 1525: shm_example_simple_lap_c_facto2_sched4_not_svdbegin 1661/3626 Test #1866: c_mpi_rep_example_personal_lap_d_facto2 .................................***Timeout 318.97 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 1866: c_mpi_rep_example_personal_lap_d_facto2 1661/3626 Test #1868: c_mpi_rep_example_personal_lap_c_facto1 .................................***Timeout 318.83 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Personal Ordering method is: Personal (myorder->permtab/peritab) Time to compute ordering 2.766543e-01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L 
structure 499500 Fill-in of L 135.000000 Time to compute symbol matrix 5.825910e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.308846e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 499500 Fill-in 135.000000 Number of operations in full-rank: LDL^t 1.25 GFlops Prediction: Model AMD 6180 MKL Time to factorize 5.559740e-02 s Time for mapping/scheduling 5.127981e-01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.695761e-02 s Time to initialize coeftab 9.441592e-02 s Start 1868: c_mpi_rep_example_personal_lap_c_facto1 1661/3626 Test #1869: c_mpi_rep_example_personal_lap_c_facto2 .................................***Timeout 318.82 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Start 1869: c_mpi_rep_example_personal_lap_c_facto2 1661/3626 Test #1870: c_mpi_rep_example_personal_lap_c_facto3 .................................***Timeout 318.82 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The 
thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Personal Ordering method is: Personal (myorder->permtab/peritab) Time to compute ordering 5.865816e-01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 499500 Fill-in of L 135.000000 Time to compute symbol matrix 1.917686e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.837420e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 499500 Fill-in 135.000000 Number of operations in full-rank: LL^t 1.25 GFlops Prediction: Model AMD 6180 MKL Time to factorize 5.430933e-02 s Time for mapping/scheduling 6.177503e-01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 8.902547e-02 s Time to initialize coeftab 4.667464e-02 s Start 1870: c_mpi_rep_example_personal_lap_c_facto3 1661/3626 Test #1872: c_mpi_rep_example_personal_lap_z_facto0 
.................................***Timeout 318.80 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 1872: c_mpi_rep_example_personal_lap_z_facto0 1661/3626 Test #1873: c_mpi_rep_example_personal_lap_z_facto1 .................................***Timeout 318.76 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Start 1873: c_mpi_rep_example_personal_lap_z_facto1 1661/3626 Test #1874: c_mpi_rep_example_personal_lap_z_facto2 .................................***Timeout 318.74 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - 
CUDA 8.0 Low rank parameters: Strategy No compression Start 1874: c_mpi_rep_example_personal_lap_z_facto2 1661/3626 Test #1875: c_mpi_rep_example_personal_lap_z_facto3 .................................***Timeout 318.72 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Personal Ordering method is: Personal (myorder->permtab/peritab) Time to compute ordering 5.407787e-01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 499500 Fill-in of L 135.000000 Time to compute symbol matrix 2.472788e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Start 1875: c_mpi_rep_example_personal_lap_z_facto3 1661/3626 Test #1876: c_mpi_rep_example_personal_lap_z_facto4 .................................***Timeout 318.71 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : 
Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 1876: c_mpi_rep_example_personal_lap_z_facto4 1661/3626 Test #1878: c_mpi_rep_example_simple_scotch_mm ......................................***Timeout 318.68 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 1878: c_mpi_rep_example_simple_scotch_mm 1661/3626 Test #1880: c_mpi_rep_example_simple_scotch_mm2 .....................................***Timeout 318.66 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression 
ischedInit: The thread number has been automatically set to 256 Start 1880: c_mpi_rep_example_simple_scotch_mm2 1661/3626 Test #1881: c_mpi_rep_example_simple_single_rsa .....................................***Timeout 318.62 sec RSA driver is no longer supported and is replaced by the HB driver RSA driver is no longer supported and is replaced by the HB driver RSA driver is no longer supported and is replaced by the HB driver RSA driver is no longer supported and is replaced by the HB driver ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 1881: c_mpi_rep_example_simple_single_rsa Test #1533: shm_example_simple_lap_c_facto2_sched4_kway_pqrcpbegin ..................***Timeout 318.45 sec Start 1533: shm_example_simple_lap_c_facto2_sched4_kway_pqrcpbegin 1661/3626 Test #1885: c_mpi_rep_example_step-by-step_single_rsa ...............................***Timeout 318.40 sec RSA driver is no longer supported and is replaced by the HB driver RSA driver is no longer supported and is replaced by the HB driver RSA driver is no longer supported and is replaced by the HB driver RSA driver is no longer supported and is replaced by the HB driver ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 1885: c_mpi_rep_example_step-by-step_single_rsa 1661/3626 Test #1887: c_mpi_rep_example_step-by-step_single_hb ................................***Timeout 318.30 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Start 1887: c_mpi_rep_example_step-by-step_single_hb 1661/3626 Test #1892: c_mpi_rep_example_refinement_lap_s_refine_cg_sym ........................***Timeout 318.16 sec Start 1892: c_mpi_rep_example_refinement_lap_s_refine_cg_sym 1661/3626 Test #1894: c_mpi_rep_example_refinement_lap_s_refine_bicgstab_sym ..................***Timeout 318.12 sec ischedInit: 
The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Start 1894: c_mpi_rep_example_refinement_lap_s_refine_bicgstab_sym 1661/3626 Test #1896: c_mpi_rep_example_refinement_lap_d_refine_gmres_sym .....................***Timeout 318.05 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: 
Scotch Time to compute ordering 1.337263e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.690703e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.144688e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 9.064079e-01 s Time to initialize internal csc 1.702843e-02 s - iteration 1 : total iteration time 1.01 s error 0.20451 - iteration 2 : total iteration time 1.11 s error 0.05944 - iteration 3 : total iteration time 2.06 s error 0.019007 - iteration 4 : total iteration time 6.78 s error 0.0066596 - iteration 5 : total iteration time 3.08 s error 0.0023054 - iteration 6 : total iteration time 3.14 s error 0.00077935 - iteration 7 : total iteration time 6.95 s error 0.00027759 - iteration 8 : total iteration time 18.4 s error 9.3504e-05 - iteration 9 : total iteration time 7.81 s error 3.0631e-05 - iteration 10 : total iteration time 10.6 s error 1.0017e-05 - iteration 11 : total iteration time 3.79 s error 3.0969e-06 - iteration 12 : total iteration time 7.64 s error 9.333e-07 - iteration 13 : total iteration time 6.4 s error 2.7791e-07 - iteration 14 : total iteration time 3.29 s error 8.2065e-08 - iteration 15 : total iteration time 3.3 s error 2.3931e-08 Start 1896: c_mpi_rep_example_refinement_lap_d_refine_gmres_sym 1661/3626 Test #1900: c_mpi_rep_example_refinement_lap_c_refine_bicgstab_her ..................***Timeout 317.83 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The 
thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Complex32 Format: CSC N: 1000 nnz: 11476 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.411109e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 84938 Fill-in of L 7.401359 Time to compute symbol matrix 1.909386e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.159274e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 169876 Fill-in 14.802719 Number of operations in full-rank: LU 62.91 MFlops Prediction: Model AMD 6180 MKL Time to factorize 2.138097e-03 s Time for mapping/scheduling 3.589649e+00 s Time to initialize internal csc 7.630370e-02 s - iteration 1 : total iteration time 5.78 s error 0.086615 - iteration 2 : total iteration time 1.83 s error 0.017746 - iteration 3 : total iteration time 1.26 s error 0.004478 - iteration 4 : total iteration time 1.34 s error 0.0011468 - iteration 5 : total iteration time 1.47 s error 0.00040184 - iteration 6 : total iteration time 2.61 s error 
6.6774e-05 Start 1900: c_mpi_rep_example_refinement_lap_c_refine_bicgstab_her 1661/3626 Test #1901: c_mpi_rep_example_refinement_lap_c_refine_cg_sym ........................***Timeout 317.82 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Start 1901: c_mpi_rep_example_refinement_lap_c_refine_cg_sym 1661/3626 Test #1904: c_mpi_rep_example_refinement_lap_z_refine_cg_her ........................***Timeout 317.72 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Start 1904: c_mpi_rep_example_refinement_lap_z_refine_cg_her 1661/3626 Test #1905: c_mpi_rep_example_refinement_lap_z_refine_gmres_her 
.....................***Timeout 317.69 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Start 1905: c_mpi_rep_example_refinement_lap_z_refine_gmres_her 1661/3626 Test #1906: c_mpi_rep_example_refinement_lap_z_refine_bicgstab_her ..................***Timeout 317.37 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Complex64 Format: CSC N: 1000 nnz: 11476 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time 
to compute ordering 8.764213e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 84938 Fill-in of L 7.401359 Time to compute symbol matrix 3.136456e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.608143e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 169876 Fill-in 14.802719 Number of operations in full-rank: LU 62.91 MFlops Prediction: Model AMD 6180 MKL Time to factorize 2.138097e-03 s Time for mapping/scheduling 1.982763e+00 s Time to initialize internal csc 5.269329e-02 s - iteration 1 : total iteration time 11.6 s error 0.086615 - iteration 2 : total iteration time 13.4 s error 0.017746 - iteration 3 : total iteration time 5.98 s error 0.004478 - iteration 4 : total iteration time 3.51 s error 0.0011468 - iteration 5 : total iteration time 5.56 s error 0.00040184 - iteration 6 : total iteration time 4.92 s error 6.6774e-05 - iteration 7 : total iteration time 4.06 s error 9.0717e-06 - iteration 8 : total iteration time 2.25 s error 2.1003e-06 - iteration 9 : total iteration time 2.43 s error 6.593e-07 Start 1906: c_mpi_rep_example_refinement_lap_z_refine_bicgstab_her 1661/3626 Test #1908: c_mpi_rep_example_refinement_lap_z_refine_gmres_sym .....................***Timeout 317.20 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: 
Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.333203e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.818793e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.400698e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 4.260887e-01 s Time to initialize internal csc 4.157099e-02 s - iteration 1 : total iteration time 5.96 s error 0.20013 - iteration 2 : total iteration time 1.83 s error 0.056488 - iteration 3 : total iteration time 1.56 s error 0.017842 - iteration 4 : total iteration time 1.65 s error 0.0060829 - iteration 5 : total iteration time 2.01 s error 0.0021257 - iteration 6 : total iteration time 1.91 s error 0.00075052 - iteration 7 : total iteration time 5.32 s error 0.00026229 - iteration 8 : total iteration time 2.46 s error 8.7579e-05 - iteration 9 : total iteration time 5.35 s error 2.9067e-05 - iteration 10 : total iteration time 2.32 s error 9.6345e-06 - iteration 11 : total iteration time 2.02 s error 2.9776e-06 - iteration 12 : total iteration time 2.35 s error 8.9895e-07 
Start 1908: c_mpi_rep_example_refinement_lap_z_refine_gmres_sym 1661/3626 Test #1909: c_mpi_rep_example_refinement_lap_z_refine_bicgstab_sym ..................***Timeout 317.11 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 1909: c_mpi_rep_example_refinement_lap_z_refine_bicgstab_sym 1661/3626 Test #1910: c_mpi_rep_example_simple_mixed_refine_cg ................................***Timeout 317.04 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple 
Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 12111 nnz: 40537 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.593038e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 1607873 Fill-in of L 39.664331 Time to compute symbol matrix 2.471459e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.931849e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 1607873 Fill-in 39.664331 Number of operations in full-rank: LDL^t 644.52 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.714741e-02 s Time for mapping/scheduling 5.259907e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.707644e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 6.123862e-03 s Time to initialize coeftab 1.226001e-01 s Start 1910: c_mpi_rep_example_simple_mixed_refine_cg 1661/3626 Test #1911: c_mpi_rep_example_simple_mixed_refine_gmres .............................***Timeout 317.03 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: 
Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: General Arithmetic: Double Format: CSC N: 1030 nnz: 6858 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.628369e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 51109 Fill-in of L 7.452464 Time to compute symbol matrix 8.702762e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.748233e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 102218 Fill-in 14.904929 Number of operations in full-rank: LU 5.50 MFlops Prediction: Model AMD 6180 MKL Time to factorize 7.121319e-04 s Time for mapping/scheduling 1.363919e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.183518e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 5.311747e-02 s Time to initialize coeftab 3.076527e-01 s Time to factorize 8.730247e-01 s ( 6.30 MFlop/s) Number of operations 9.09 MFlops Number of static pivots 0 Memory usage of coeftab 150 Ko Time to solve 1.009215e+00 s - iteration 1 : total iteration time 1.34 s error 1.1494e-11 - iteration 2 : total iteration time 1.76 s error 8.7537e-16 Time for refinement 5.496925e+00 s || A ||_1 3.076897e-01 
max(|| b_i ||_oo) 8.377794e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 3.076897e-01 max(|| b_i ||_oo) 8.377794e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.835671e-16 max(|| b_i - A x_i ||_1) 3.336461e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.612752e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.835671e-16 Start 1911: c_mpi_rep_example_simple_mixed_refine_gmres 1661/3626 Test #1913: c_mpi_rep_example_simple_mixed_lap_d_refine_cg_sym ......................***Timeout 316.42 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.313197e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.783999e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.525566e-02 s 
+-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 4.333015e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.317528e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.511979e-03 s Time to initialize coeftab 3.308694e-01 s Time to factorize 1.121003e+00 s ( 8.91 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Memory usage of coeftab 137 Ko Start 1913: c_mpi_rep_example_simple_mixed_lap_d_refine_cg_sym 1661/3626 Test #1914: c_mpi_rep_example_simple_mixed_lap_d_refine_gmres_sym ...................***Timeout 316.39 sec ischedInit: The thread number has been automatically set to 256 Start 1914: c_mpi_rep_example_simple_mixed_lap_d_refine_gmres_sym 1661/3626 Test #1915: c_mpi_rep_example_simple_mixed_lap_d_refine_bicgstab_sym ................***Timeout 316.37 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Start 1915: 
c_mpi_rep_example_simple_mixed_lap_d_refine_bicgstab_sym 1661/3626 Test #1916: c_mpi_rep_example_simple_mixed_lap_z_refine_cg_her ......................***Timeout 316.35 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Start 1916: c_mpi_rep_example_simple_mixed_lap_z_refine_cg_her 1661/3626 Test #1917: c_mpi_rep_example_simple_mixed_lap_z_refine_gmres_her ...................***Timeout 316.34 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 1917: c_mpi_rep_example_simple_mixed_lap_z_refine_gmres_her 1661/3626 Test #1918: c_mpi_rep_example_simple_mixed_lap_z_refine_bicgstab_her ................***Timeout 315.99 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple 
Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Start 1918: c_mpi_rep_example_simple_mixed_lap_z_refine_bicgstab_her 1661/3626 Test #1921: c_mpi_rep_example_simple_mixed_lap_z_refine_bicgstab_sym ................***Timeout 315.28 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 1921: c_mpi_rep_example_simple_mixed_lap_z_refine_bicgstab_sym 1661/3626 Test #1923: mpi_rep_example_simple_lap_s_facto1_sched0_1d ...........................***Timeout 315.23 sec Start 1923: mpi_rep_example_simple_lap_s_facto1_sched0_1d 1661/3626 Test #1925: mpi_rep_example_simple_lap_d_facto0_sched0_1d ...........................***Timeout 314.54 sec ischedInit: The thread number has been automatically set to 256 Start 1925: mpi_rep_example_simple_lap_d_facto0_sched0_1d 1661/3626 Test #1926: mpi_rep_example_simple_lap_d_facto1_sched0_1d ...........................***Timeout 314.41 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Start 1926: mpi_rep_example_simple_lap_d_facto1_sched0_1d 1661/3626 Test #1927: 
mpi_rep_example_simple_lap_d_facto2_sched0_1d ...........................***Timeout 314.40 sec ischedInit: The thread number has been automatically set to 256 Start 1927: mpi_rep_example_simple_lap_d_facto2_sched0_1d 1661/3626 Test #1928: mpi_rep_example_simple_lap_c_facto0_sched0_1d ...........................***Timeout 314.20 sec Start 1928: mpi_rep_example_simple_lap_c_facto0_sched0_1d 1661/3626 Test #1929: mpi_rep_example_simple_lap_c_facto1_sched0_1d ...........................***Timeout 314.09 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Start 1929: mpi_rep_example_simple_lap_c_facto1_sched0_1d 1661/3626 Test #1932: mpi_rep_example_simple_lap_c_facto4_sched0_1d ...........................***Timeout 312.89 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 1932: mpi_rep_example_simple_lap_c_facto4_sched0_1d 1661/3626 Test #1936: mpi_rep_example_simple_lap_z_facto3_sched0_1d ...........................***Timeout 307.90 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread 
number has been automatically set to 256 Start 1936: mpi_rep_example_simple_lap_z_facto3_sched0_1d 1661/3626 Test #1938: mpi_rep_example_simple_lap_s_facto0_sched1_1d ...........................***Timeout 306.33 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 1938: mpi_rep_example_simple_lap_s_facto0_sched1_1d 1661/3626 Test #1940: mpi_rep_example_simple_lap_s_facto2_sched1_1d ...........................***Timeout 304.69 sec Start 1940: mpi_rep_example_simple_lap_s_facto2_sched1_1d 1661/3626 Test #1941: mpi_rep_example_simple_lap_d_facto0_sched1_1d ...........................***Timeout 304.58 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank 
parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 1941: mpi_rep_example_simple_lap_d_facto0_sched1_1d 1661/3626 Test #1942: mpi_rep_example_simple_lap_d_facto1_sched1_1d ...........................***Timeout 304.26 sec Start 1942: mpi_rep_example_simple_lap_d_facto1_sched1_1d 1661/3626 Test #1945: mpi_rep_example_simple_lap_c_facto1_sched1_1d ...........................***Timeout 303.09 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Start 1945: mpi_rep_example_simple_lap_c_facto1_sched1_1d 1661/3626 Test #1946: mpi_rep_example_simple_lap_c_facto2_sched1_1d ...........................***Timeout 303.06 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: 
PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Start 1946: mpi_rep_example_simple_lap_c_facto2_sched1_1d 1661/3626 Test #1947: mpi_rep_example_simple_lap_c_facto3_sched1_1d ...........................***Timeout 301.78 sec Start 1947: mpi_rep_example_simple_lap_c_facto3_sched1_1d 1661/3626 Test #1949: mpi_rep_example_simple_lap_z_facto0_sched1_1d ...........................***Timeout 301.05 sec Start 1949: mpi_rep_example_simple_lap_z_facto0_sched1_1d Test #1545: shm_example_simple_lap_c_facto2_sched4_kway_tqrcpbegin ..................***Timeout 300.61 sec Start 1545: shm_example_simple_lap_c_facto2_sched4_kway_tqrcpbegin Test #1079: shm_example_simple_lap_c_facto4_sched1_kway_svdbegin ....................***Timeout 299.58 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 1662/3626 Test #1951: 
mpi_rep_example_simple_lap_z_facto2_sched1_1d ...........................***Timeout 299.35 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 1951: mpi_rep_example_simple_lap_z_facto2_sched1_1d 1662/3626 Test #1953: mpi_rep_example_simple_lap_z_facto4_sched1_1d ...........................***Timeout 298.56 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread 
number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.893246e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.159116e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.681987e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.377620e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.086123e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 4.499553e-02 s Time to initialize coeftab 3.943735e-01 s Time to factorize 1.281568e+00 s (16.63 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Memory usage of coeftab 274 Ko Time to solve 1.436444e+00 s Start 1953: mpi_rep_example_simple_lap_z_facto4_sched1_1d 1662/3626 Test #1954: mpi_rep_example_simple_lap_s_facto0_sched4_1d ...........................***Timeout 297.94 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started 
PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 1954: mpi_rep_example_simple_lap_s_facto0_sched4_1d 1662/3626 Test #1955: mpi_rep_example_simple_lap_s_facto1_sched4_1d ...........................***Timeout 297.51 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Start 1955: mpi_rep_example_simple_lap_s_facto1_sched4_1d 1662/3626 Test #1958: mpi_rep_example_simple_lap_d_facto1_sched4_1d ...........................***Timeout 297.03 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Start 1958: mpi_rep_example_simple_lap_d_facto1_sched4_1d 1662/3626 Test #1960: mpi_rep_example_simple_lap_c_facto0_sched4_1d ...........................***Timeout 296.83 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.513535e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 
17.592432 Time to compute symbol matrix 1.448626e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.163613e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 5.787905e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.414053e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.457599e-01 s Start 1960: mpi_rep_example_simple_lap_c_facto0_sched4_1d Test #1549: shm_example_simple_lap_c_facto2_sched4_not_rqrrtbegin ...................***Timeout 296.70 sec Start 1549: shm_example_simple_lap_c_facto2_sched4_not_rqrrtbegin 1662/3626 Test #1962: mpi_rep_example_simple_lap_c_facto2_sched4_1d ...........................***Timeout 296.36 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Start 1962: mpi_rep_example_simple_lap_c_facto2_sched4_1d 1662/3626 Test #1963: mpi_rep_example_simple_lap_c_facto3_sched4_1d ...........................***Timeout 296.11 sec Start 1963: mpi_rep_example_simple_lap_c_facto3_sched4_1d 
1662/3626 Test #1965: mpi_rep_example_simple_lap_z_facto0_sched4_1d ...........................***Timeout 295.87 sec ischedInit: The thread number has been automatically set to 256 Start 1965: mpi_rep_example_simple_lap_z_facto0_sched4_1d 1662/3626 Test #1966: mpi_rep_example_simple_lap_z_facto1_sched4_1d ...........................***Timeout 295.41 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Start 1966: mpi_rep_example_simple_lap_z_facto1_sched4_1d 1662/3626 Test #1967: mpi_rep_example_simple_lap_z_facto2_sched4_1d ...........................***Timeout 294.98 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 
Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Start 1967: mpi_rep_example_simple_lap_z_facto2_sched4_1d 1662/3626 Test #1968: mpi_rep_example_simple_lap_z_facto3_sched4_1d ...........................***Timeout 294.96 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 1968: mpi_rep_example_simple_lap_z_facto3_sched4_1d 1662/3626 Test #1969: mpi_rep_example_simple_lap_z_facto4_sched4_1d ...........................***Timeout 294.93 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy 
No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.883893e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.568909e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.043354e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.310356e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.244077e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 3.173336e-03 s Time to initialize coeftab 1.021025e-01 s Start 1969: mpi_rep_example_simple_lap_z_facto4_sched4_1d 1662/3626 Test #1970: mpi_dst_example_simple_lap_s_facto0_sched0_1d ...........................***Timeout 294.82 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication 
support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Start 1970: mpi_dst_example_simple_lap_s_facto0_sched0_1d 1662/3626 Test #1972: mpi_dst_example_simple_lap_s_facto2_sched0_1d ...........................***Timeout 294.73 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 1972: mpi_dst_example_simple_lap_s_facto2_sched0_1d 1662/3626 Test #1973: mpi_dst_example_simple_lap_d_facto0_sched0_1d ...........................***Timeout 294.34 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 1973: mpi_dst_example_simple_lap_d_facto0_sched0_1d 1662/3626 Test #1977: mpi_dst_example_simple_lap_c_facto1_sched0_1d ...........................***Timeout 293.64 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.603800e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.158124e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.010935e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 4.652645e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.429088e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.387353e+00 s Time to initialize coeftab 3.850144e-01 s Start 1977: mpi_dst_example_simple_lap_c_facto1_sched0_1d 
1662/3626 Test #1978: mpi_dst_example_simple_lap_c_facto2_sched0_1d ...........................***Timeout 293.40 sec Start 1978: mpi_dst_example_simple_lap_c_facto2_sched0_1d 1662/3626 Test #1984: mpi_dst_example_simple_lap_z_facto3_sched0_1d ...........................***Timeout 290.42 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 1984: mpi_dst_example_simple_lap_z_facto3_sched0_1d 1662/3626 Test #1985: mpi_dst_example_simple_lap_z_facto4_sched0_1d ...........................***Timeout 290.06 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 1985: mpi_dst_example_simple_lap_z_facto4_sched0_1d 1662/3626 Test #1987: mpi_dst_example_simple_lap_s_facto1_sched1_1d ...........................***Timeout 289.83 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 
200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 1987: mpi_dst_example_simple_lap_s_facto1_sched1_1d 1662/3626 Test #1988: mpi_dst_example_simple_lap_s_facto2_sched1_1d ...........................***Timeout 288.53 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Start 1988: mpi_dst_example_simple_lap_s_facto2_sched1_1d 1662/3626 Test #1989: mpi_dst_example_simple_lap_d_facto0_sched1_1d ...........................***Timeout 287.11 sec Start 1989: mpi_dst_example_simple_lap_d_facto0_sched1_1d 1662/3626 Test #1990: mpi_dst_example_simple_lap_d_facto1_sched1_1d ...........................***Timeout 286.92 sec Start 1990: mpi_dst_example_simple_lap_d_facto1_sched1_1d 1662/3626 Test #1994: mpi_dst_example_simple_lap_c_facto2_sched1_1d ...........................***Timeout 283.93 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads 
per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 1994: mpi_dst_example_simple_lap_c_facto2_sched1_1d Test #1580: shm_example_simple_lap_c_facto3_sched4_kwayprojections_tqrcpend .........***Timeout 283.70 sec Start 1580: shm_example_simple_lap_c_facto3_sched4_kwayprojections_tqrcpend 1662/3626 Test #1995: mpi_dst_example_simple_lap_c_facto3_sched1_1d ...........................***Timeout 283.22 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 
3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.652280e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.219596e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.253824e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.151519e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.619911e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 2.057139e+00 s Time to initialize coeftab 9.397691e-01 s Start 1995: mpi_dst_example_simple_lap_c_facto3_sched1_1d 1662/3626 Test #1996: mpi_dst_example_simple_lap_c_facto4_sched1_1d ...........................***Timeout 283.07 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No 
compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.122396e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.021597e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.338831e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Start 1996: mpi_dst_example_simple_lap_c_facto4_sched1_1d 1662/3626 Test #1997: mpi_dst_example_simple_lap_z_facto0_sched1_1d ...........................***Timeout 283.01 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Start 1997: mpi_dst_example_simple_lap_z_facto0_sched1_1d 1662/3626 Test #1998: 
mpi_dst_example_simple_lap_z_facto1_sched1_1d ...........................***Timeout 282.75 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.847458e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.792207e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.566608e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.410103e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 
1.033852e+01 s Start 1998: mpi_dst_example_simple_lap_z_facto1_sched1_1d 1662/3626 Test #1999: mpi_dst_example_simple_lap_z_facto2_sched1_1d ...........................***Timeout 282.55 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 1999: mpi_dst_example_simple_lap_z_facto2_sched1_1d 1662/3626 Test #2000: mpi_dst_example_simple_lap_z_facto3_sched1_1d ...........................***Timeout 282.29 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number 
of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Start 2000: mpi_dst_example_simple_lap_z_facto3_sched1_1d Test #1582: shm_example_simple_lap_c_facto3_sched4_not_rqrrtend .....................***Timeout 281.24 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 1582: shm_example_simple_lap_c_facto3_sched4_not_rqrrtend 1662/3626 Test #2001: mpi_dst_example_simple_lap_z_facto4_sched1_1d ...........................***Timeout 280.78 sec Start 2001: mpi_dst_example_simple_lap_z_facto4_sched1_1d 1662/3626 Test #2002: mpi_dst_example_simple_lap_s_facto0_sched4_1d ...........................***Timeout 280.08 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse 
matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2002: mpi_dst_example_simple_lap_s_facto0_sched4_1d Test #1585: shm_example_simple_lap_c_facto3_sched4_kwayprojections_rqrrtbegin .......***Timeout 280.05 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 1585: shm_example_simple_lap_c_facto3_sched4_kwayprojections_rqrrtbegin Test #1589: 
shm_example_simple_lap_c_facto4_sched4_not_svdbegin .....................***Timeout 279.15 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 1589: shm_example_simple_lap_c_facto4_sched4_not_svdbegin Test #1594: shm_example_simple_lap_c_facto4_sched4_kwayprojections_svdend ...........***Timeout 275.59 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy 
Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.386878e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.040041e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.156597e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.140710e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.879828e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 3.252316e-02 s Time to initialize coeftab 2.646677e-02 s Time to factorize 1.780700e+00 s (11.97 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko ------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 3.995399e-01 s Time for refinement 
1.515818e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.079546e-07 max(|| b_i - A x_i ||_1) 8.882546e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.241332e+00 (SUCCESS) Start 1594: shm_example_simple_lap_c_facto4_sched4_kwayprojections_svdend 1662/3626 Test #2005: mpi_dst_example_simple_lap_d_facto0_sched4_1d ...........................***Timeout 275.02 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.227886e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.460626e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.592367e-02 s 
+-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.560400e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.261859e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.516795e+00 s Time to initialize coeftab 2.649991e-01 s Time to factorize 3.562294e+00 s ( 1.42 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Memory usage of coeftab 137 Ko Time to solve 1.687540e+00 s Time for refinement 1.720572e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.080309e-16 max(|| b_i - A x_i ||_1) 1.776130e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.231860e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.080309e-16 max(|| b_i - A x_i ||_1) 1.776130e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.231860e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.080309e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.080309e-16 max(|| b_i - A x_i ||_1) 1.776130e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.231860e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 1.776130e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.231860e-03 (SUCCESS) Start 2005: mpi_dst_example_simple_lap_d_facto0_sched4_1d 1662/3626 Test #2007: mpi_dst_example_simple_lap_d_facto2_sched4_1d ...........................***Timeout 273.78 sec ischedInit: The 
thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.350314e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.007954e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.030376e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.665142e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.037638e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to 
initialize internal csc 9.721417e-01 s Time to initialize coeftab 1.455083e-01 s Time to factorize 1.404337e+00 s ( 7.11 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Memory usage of coeftab 274 Ko Time to solve 1.393434e+00 s Time for refinement 1.238179e+00 s || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.950504e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.950504e-16 max(|| b_i - A x_i ||_1) 1.644142e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.066006e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 1.644142e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.066006e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.950504e-16 max(|| b_i - A x_i ||_1) 1.644142e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.066006e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.950504e-16 max(|| b_i - A x_i ||_1) 1.644142e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.066006e-03 (SUCCESS) Start 2007: mpi_dst_example_simple_lap_d_facto2_sched4_1d 1662/3626 Test #2008: mpi_dst_example_simple_lap_c_facto0_sched4_1d ...........................***Timeout 272.58 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per 
process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.152705e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.043373e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.574830e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.167615e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.082989e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 5.023408e-01 s Time to initialize coeftab 1.372031e-01 s Time to factorize 4.146786e+00 s ( 4.89 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Memory usage of coeftab 137 Ko Time to solve 1.475073e+00 s Time for refinement 3.023714e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 
2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.909195e-07 max(|| b_i - A x_i ||_1) 8.615417e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.173969e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.909195e-07 max(|| b_i - A x_i ||_1) 8.615417e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.173969e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.909195e-07 max(|| b_i - A x_i ||_1) 8.615417e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.173969e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.909195e-07 max(|| b_i - A x_i ||_1) 8.615417e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.173969e+00 (SUCCESS) Start 2008: mpi_dst_example_simple_lap_c_facto0_sched4_1d 1662/3626 Test #2009: mpi_dst_example_simple_lap_c_facto1_sched4_1d ...........................***Timeout 271.84 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 
+-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.558721e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.617086e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.309961e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.301027e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.236997e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.484328e+00 s Time to initialize coeftab 3.413432e-01 s Time to factorize 1.757953e+00 s (12.12 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Memory usage of coeftab 137 Ko Time to solve 1.628023e+00 s Time for refinement 1.363773e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.811150e-07 max(|| b_i - A x_i ||_1) 7.989589e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.016052e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.811150e-07 max(|| b_i - A x_i ||_1) 7.989589e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.016052e+00 (SUCCESS) max(|| b_i - A 
x_i ||_2 / || b_i ||_2) 1.811150e-07 max(|| b_i - A x_i ||_1) 7.989589e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.016052e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.811150e-07 max(|| b_i - A x_i ||_1) 7.989589e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.016052e+00 (SUCCESS) Start 2009: mpi_dst_example_simple_lap_c_facto1_sched4_1d 1662/3626 Test #2010: mpi_dst_example_simple_lap_c_facto2_sched4_1d ...........................***Timeout 268.86 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.426105e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.310627e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 
1.865218e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 4.284888e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.436079e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.070553e-01 s Time to initialize coeftab 8.732537e-02 s Time to factorize 1.135793e+00 s (35.19 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Memory usage of coeftab 274 Ko Time to solve 7.318150e-01 s Time for refinement 5.394636e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.730714e-07 max(|| b_i - A x_i ||_1) 7.539204e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.902403e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.730714e-07 max(|| b_i - A x_i ||_1) 7.539204e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.902403e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.730714e-07 max(|| b_i - A x_i ||_1) 7.539204e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.902403e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.730714e-07 max(|| b_i - A x_i ||_1) 7.539204e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.902403e+00 (SUCCESS) Start 2010: mpi_dst_example_simple_lap_c_facto2_sched4_1d 1662/3626 Test #2012: mpi_dst_example_simple_lap_c_facto4_sched4_1d ...........................***Timeout 268.16 sec 
ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.892641e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.915211e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.804420e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.979893e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.739998e+00 s +-------------------------------------------------+ Factorization task: Factorization 
used: LDL^h Time to initialize internal csc 6.873520e-01 s Time to initialize coeftab 8.267832e-02 s Time to factorize 2.066346e+00 s (10.31 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Memory usage of coeftab 137 Ko Time to solve 1.208002e+00 s Time for refinement 9.514689e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.816236e-07 max(|| b_i - A x_i ||_1) 7.916370e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.997576e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.816236e-07 max(|| b_i - A x_i ||_1) 7.916370e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.997576e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.816236e-07 max(|| b_i - A x_i ||_1) 7.916370e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.997576e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.816236e-07 max(|| b_i - A x_i ||_1) 7.916370e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.997576e+00 (SUCCESS) Start 2012: mpi_dst_example_simple_lap_c_facto4_sched4_1d 1662/3626 Test #2014: mpi_dst_example_simple_lap_z_facto1_sched4_1d ...........................***Timeout 267.73 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started 
PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.264164e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.906054e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.789284e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 9.596799e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.101309e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.574502e-01 s Time to initialize coeftab 8.401204e-02 s Time to factorize 1.549016e+00 s (13.76 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Memory usage of coeftab 274 Ko Time to solve 9.858599e-01 s Time for refinement 9.991354e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 
|| A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.991895e-16 max(|| b_i - A x_i ||_1) 1.730549e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.366763e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.991895e-16 max(|| b_i - A x_i ||_1) 1.730549e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.366763e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.991895e-16 max(|| b_i - A x_i ||_1) 1.730549e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.366763e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.991895e-16 max(|| b_i - A x_i ||_1) 1.730549e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.366763e-03 (SUCCESS) Start 2014: mpi_dst_example_simple_lap_z_facto1_sched4_1d 1662/3626 Test #2019: mpi_dst_example_simple_lap_s_facto0_sched0_not_svdend ...................***Timeout 263.57 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per 
block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.418911e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.018300e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.371373e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.493848e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.232569e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.483924e-01 s Time to initialize coeftab 7.399714e-02 s Time to factorize 3.546271e+00 s ( 1.43 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.339060e-01 s Time for refinement 2.842371e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 
max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.913705e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.913705e-07 max(|| b_i - A x_i ||_1) 8.598442e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.080472e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.598442e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.080472e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.913705e-07 max(|| b_i - A x_i ||_1) 8.598442e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.080472e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.913705e-07 max(|| b_i - A x_i ||_1) 8.598442e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.080472e+00 (SUCCESS) Start 2019: mpi_dst_example_simple_lap_s_facto0_sched0_not_svdend 1662/3626 Test #2020: mpi_dst_example_simple_lap_s_facto0_sched0_kway_svdbegin ................***Timeout 262.13 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 
Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.037539e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.925214e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.016179e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.253909e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.190852e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.001606e-02 s Time to initialize coeftab 4.276626e-02 s Start 2020: mpi_dst_example_simple_lap_s_facto0_sched0_kway_svdbegin Test #1636: shm_example_simple_lap_z_facto0_sched4_kway_rqrcpend ....................***Timeout 256.37 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started 
PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.810566e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.000411e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.698243e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 1.208293e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.404628e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.121363e-02 s Time to initialize coeftab 8.511124e-02 s Time to factorize 1.093688e+00 s (18.54 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Compression: 
------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 497 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1.25 Mo / 1.25 Mo Time to solve 5.050314e-01 s Time for refinement 5.277594e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.020755e-16 max(|| b_i - A x_i ||_1) 2.037514e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.141339e-03 (SUCCESS) Start 1636: shm_example_simple_lap_z_facto0_sched4_kway_rqrcpend 1662/3626 Test #2022: mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_svdbegin .....***Timeout 253.31 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.145097e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.157889e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.231278e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.272585e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.700090e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.448289e+00 s Time to initialize coeftab 4.879985e-01 s Time to factorize 1.607401e+01 s (322.50 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 4.195908e-01 s Time for refinement 4.802553e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i 
||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.048760e-07 max(|| b_i - A x_i ||_1) 9.459670e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.188693e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.048760e-07 max(|| b_i - A x_i ||_1) 9.459670e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.188693e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.048760e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.048760e-07 max(|| b_i - A x_i ||_1) 9.459670e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.188693e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 9.459670e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.188693e+00 (SUCCESS) Start 2022: mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_svdbegin 1662/3626 Test #2024: mpi_dst_example_simple_lap_s_facto0_sched0_not_pqrcpbegin ...............***Timeout 250.30 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: 
The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.770441e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.458876e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.310895e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 5.085911e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.867815e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.776181e-01 s Time to initialize coeftab 9.838890e-02 s Time to factorize 2.951148e+00 s ( 1.72 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.084011e-01 s - iteration 1 : total iteration time 0.0911 s error 5.52e-11 Time for refinement 1.714545e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| 
b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.022811e-08 max(|| b_i - A x_i ||_1) 2.924001e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.674272e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.022811e-08 max(|| b_i - A x_i ||_1) 2.924001e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.674272e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.022811e-08 max(|| b_i - A x_i ||_1) 2.924001e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.674272e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.022811e-08 max(|| b_i - A x_i ||_1) 2.924001e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.674272e-01 (SUCCESS) Start 2024: mpi_dst_example_simple_lap_s_facto0_sched0_not_pqrcpbegin 1662/3626 Test #2025: mpi_dst_example_simple_lap_s_facto0_sched0_not_pqrcpend .................***Timeout 249.32 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 
Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.177339e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.421614e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.171794e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.352327e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.972268e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.909918e-01 s Time to initialize coeftab 8.755889e-02 s Time to factorize 6.173611e-01 s ( 8.20 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 8.261476e-02 s Time for refinement 1.291393e-01 s || A ||_1 
5.112499e-02 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.098116e-07 max(|| b_i - A x_i ||_1) 9.590242e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.205101e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.098116e-07 max(|| b_i - A x_i ||_1) 9.590242e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.205101e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.098116e-07 max(|| b_i - A x_i ||_1) 9.590242e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.205101e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.098116e-07 max(|| b_i - A x_i ||_1) 9.590242e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.205101e+00 (SUCCESS) Start 2025: mpi_dst_example_simple_lap_s_facto0_sched0_not_pqrcpend 1662/3626 Test #2026: mpi_dst_example_simple_lap_s_facto0_sched0_kway_pqrcpbegin ..............***Timeout 249.10 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress 
minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.954273e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.256957e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.847391e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 5.068273e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.113739e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.154810e-01 s Time to initialize coeftab 1.268716e-01 s Time to factorize 1.537582e+00 s ( 3.29 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ 
Total 68.5 Ko / 68.5 Ko Time to solve 3.570114e-02 s - iteration 1 : total iteration time 0.0111 s error 5.5154e-11 Time for refinement 7.190600e-02 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.040999e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.040999e-08 max(|| b_i - A x_i ||_1) 2.933775e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.686553e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.040999e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.040999e-08 max(|| b_i - A x_i ||_1) 2.933775e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.686553e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.933775e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.686553e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.933775e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.686553e-01 (SUCCESS) Start 2026: mpi_dst_example_simple_lap_s_facto0_sched0_kway_pqrcpbegin 1662/3626 Test #2027: mpi_dst_example_simple_lap_s_facto0_sched0_kway_pqrcpend ................***Timeout 249.03 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: 
PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.743071e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.701416e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.262224e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.696065e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.969858e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.501475e-01 s Time to initialize coeftab 6.172949e-02 s Time to factorize 1.210001e+00 s ( 4.18 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank 
supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 8.165097e-02 s Time for refinement 1.861710e-01 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.098116e-07 max(|| b_i - A x_i ||_1) 9.590242e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.205101e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.098116e-07 max(|| b_i - A x_i ||_1) 9.590242e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.205101e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.098116e-07 max(|| b_i - A x_i ||_1) 9.590242e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.205101e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.098116e-07 max(|| b_i - A x_i ||_1) 9.590242e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.205101e+00 (SUCCESS) Start 2027: mpi_dst_example_simple_lap_s_facto0_sched0_kway_pqrcpend Test #1646: shm_example_simple_lap_z_facto0_sched4_not_rqrrtend .....................***Timeout 248.57 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models 
CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.721812e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.741228e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.749658e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 1.531763e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.119655e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.436765e-02 s Time to initialize coeftab 3.900702e-02 s Time to factorize 5.041591e-01 s (40.23 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 497 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko 
------------------------------------------------ Total 1.25 Mo / 1.25 Mo Time to solve 4.241603e-01 s Time for refinement 1.430683e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.982061e-16 max(|| b_i - A x_i ||_1) 2.013345e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.080352e-03 (SUCCESS) Start 1646: shm_example_simple_lap_z_facto0_sched4_not_rqrrtend 1662/3626 Test #2028: mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_pqrcpbegin ...***Timeout 248.22 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: 
Scotch Time to compute ordering 3.630615e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.188753e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.656403e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 8.283201e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.806196e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.782726e+00 s Time to initialize coeftab 3.865824e-01 s Time to factorize 6.605438e+00 s (784.78 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 3.193432e-02 s - iteration 1 : total iteration time 0.0269 s error 5.52e-11 Time for refinement 8.379552e-02 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.022811e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.022811e-08 max(|| b_i 
- A x_i ||_2 / || b_i ||_2) 6.022811e-08 max(|| b_i - A x_i ||_1) 2.924001e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.674272e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.022811e-08 max(|| b_i - A x_i ||_1) 2.924001e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.674272e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.924001e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.674272e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.924001e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.674272e-01 (SUCCESS) Start 2028: mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_pqrcpbegin 1662/3626 Test #2030: mpi_dst_example_simple_lap_s_facto0_sched0_not_rqrcpbegin ...............***Timeout 245.56 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: 
CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.731830e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.923273e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.453876e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.217844e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.276498e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.195080e-01 s Time to initialize coeftab 1.540996e-01 s Time to factorize 3.277713e+00 s ( 1.54 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.2 Ko / 44.3 Ko ------------------------------------------------ Total 68.4 Ko / 68.5 Ko Time to solve 5.360591e-01 s - iteration 1 : total iteration time 0.369 s error 5.5402e-11 Time for refinement 1.147261e+00 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 
max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.032994e-08 max(|| b_i - A x_i ||_1) 2.912868e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.660282e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.032994e-08 max(|| b_i - A x_i ||_1) 2.912868e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.660282e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.032994e-08 max(|| b_i - A x_i ||_1) 2.912868e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.660282e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.032994e-08 max(|| b_i - A x_i ||_1) 2.912868e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.660282e-01 (SUCCESS) Start 2030: mpi_dst_example_simple_lap_s_facto0_sched0_not_rqrcpbegin 1662/3626 Test #2032: mpi_dst_example_simple_lap_s_facto0_sched0_kway_rqrcpbegin ..............***Timeout 245.30 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The 
thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.575229e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.122577e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.547633e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.309663e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.281129e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 8.044104e-01 s Time to initialize coeftab 2.046099e-01 s Time to factorize 7.052489e+00 s (735.03 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.2 Ko / 44.3 Ko ------------------------------------------------ Total 68.4 Ko / 68.5 Ko Time to solve 2.894239e-01 s - iteration 1 : total iteration time 0.468 s error 5.5277e-11 Time for refinement 1.169194e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i 
||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.998336e-08 max(|| b_i - A x_i ||_1) 2.896496e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.639709e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.998336e-08 max(|| b_i - A x_i ||_1) 2.896496e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.639709e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.998336e-08 max(|| b_i - A x_i ||_1) 2.896496e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.639709e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.998336e-08 max(|| b_i - A x_i ||_1) 2.896496e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.639709e-01 (SUCCESS) Start 2032: mpi_dst_example_simple_lap_s_facto0_sched0_kway_rqrcpbegin 1662/3626 Test #2033: mpi_dst_example_simple_lap_s_facto0_sched0_kway_rqrcpend ................***Timeout 244.79 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of 
kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.975993e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.819316e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.362471e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.804401e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.615862e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.940466e-01 s Time to initialize coeftab 1.622769e-01 s Time to factorize 4.619481e+00 s ( 1.10 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 2.189838e-01 s Time for refinement 3.182902e-01 s || A ||_1 5.112499e-02 
max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.106819e-07 max(|| b_i - A x_i ||_1) 9.609415e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.207510e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.106819e-07 max(|| b_i - A x_i ||_1) 9.609415e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.207510e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.106819e-07 max(|| b_i - A x_i ||_1) 9.609415e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.207510e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.106819e-07 max(|| b_i - A x_i ||_1) 9.609415e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.207510e+00 (SUCCESS) Start 2033: mpi_dst_example_simple_lap_s_facto0_sched0_kway_rqrcpend Test #1165: shm_example_simple_lap_z_facto1_sched1_not_rqrrtbegin ...................***Timeout 235.44 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute 
Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.580461e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.662468e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.926399e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 4.524887e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.735422e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.364839e-01 s Time to initialize coeftab 7.521983e-01 s Time to factorize 5.263298e+00 s ( 4.05 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 497 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1.25 Mo / 1.25 Mo Time to solve 4.498664e-01 s - iteration 1 : total iteration time 0.971 s error 5.2588e-13 Time for refinement 1.822409e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / 
|| b_i ||_2) 5.258773e-13 max(|| b_i - A x_i ||_1) 7.875633e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.987289e+00 (SUCCESS) Test #1656: shm_example_simple_lap_z_facto1_sched4_kway_svdend ......................***Timeout 235.28 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.226356e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.165448e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.009525e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: 
LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 4.444088e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.992411e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.602362e-01 s Time to initialize coeftab 8.981325e-02 s Time to factorize 3.688774e+00 s ( 5.78 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 497 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1.25 Mo / 1.25 Mo Time to solve 6.078325e-01 s Time for refinement 4.594984e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.747183e-16 max(|| b_i - A x_i ||_1) 1.845091e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.655790e-03 (SUCCESS) Start 1656: shm_example_simple_lap_z_facto1_sched4_kway_svdend 1663/3626 Test #2034: mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_rqrcpbegin ...***Timeout 231.28 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia 
K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.147867e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.749087e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.439275e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.287994e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.242876e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 7.132650e-01 s Time to initialize coeftab 5.027081e-01 s Time to factorize 1.444955e+01 s (358.75 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank 
supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.2 Ko / 44.3 Ko ------------------------------------------------ Total 68.4 Ko / 68.5 Ko Time to solve 1.757165e-01 s - iteration 1 : total iteration time 0.313 s error 5.5277e-11 Time for refinement 6.308035e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.998336e-08 max(|| b_i - A x_i ||_1) 2.896496e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.639709e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.998336e-08 max(|| b_i - A x_i ||_1) 2.896496e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.639709e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.998336e-08 max(|| b_i - A x_i ||_1) 2.896496e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.639709e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.998336e-08 max(|| b_i - A x_i ||_1) 2.896496e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.639709e-01 (SUCCESS) Start 2034: mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_rqrcpbegin 1663/3626 Test #2035: mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_rqrcpend .....***Timeout 231.19 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple 
Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.032676e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.180457e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.090912e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.838265e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.316933e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 9.701898e-01 s Time to initialize coeftab 2.540136e-01 s Time to factorize 4.406815e+00 s ( 1.15 
MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 3.340009e-01 s Time for refinement 3.779976e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.106819e-07 max(|| b_i - A x_i ||_1) 9.609415e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.207510e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.106819e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.106819e-07 max(|| b_i - A x_i ||_1) 9.609415e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.207510e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 9.609415e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.207510e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.106819e-07 max(|| b_i - A x_i ||_1) 9.609415e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.207510e+00 (SUCCESS) Start 2035: mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_rqrcpend 1663/3626 Test #2036: mpi_dst_example_simple_lap_s_facto0_sched0_not_tqrcpbegin ...............***Timeout 231.16 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.387462e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.703722e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.776111e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.106465e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.182337e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize 
internal csc 4.626175e-01 s Time to initialize coeftab 2.911628e-01 s Time to factorize 1.382654e+01 s (374.92 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.2 Ko / 44.3 Ko ------------------------------------------------ Total 68.4 Ko / 68.5 Ko Time to solve 1.542871e-02 s - iteration 1 : total iteration time 0.00712 s error 5.556e-11 Time for refinement 2.146973e-02 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.053065e-08 max(|| b_i - A x_i ||_1) 2.921346e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.670936e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.053065e-08 max(|| b_i - A x_i ||_1) 2.921346e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.670936e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.053065e-08 max(|| b_i - A x_i ||_1) 2.921346e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.670936e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.053065e-08 max(|| b_i - A x_i ||_1) 2.921346e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.670936e-01 (SUCCESS) Start 2036: mpi_dst_example_simple_lap_s_facto0_sched0_not_tqrcpbegin Test #1178: shm_example_simple_lap_z_facto2_sched1_kwayprojections_svdend ...........***Timeout 231.04 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.861166e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.501407e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.152340e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 3.375021e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.326473e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.140245e-01 s Time to initialize 
coeftab 1.169685e-01 s Time to factorize 2.118990e+00 s (18.86 MFlop/s) Number of operations 8.15 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 14.1 Ko Outside 16.9 Ko Low-rank supernodes Diag in diag 497 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 1.49 Mo / 1.49 Mo ------------------------------------------------ Total 2.01 Mo / 2.01 Mo Time to solve 1.396289e-01 s Time for refinement 7.827480e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.743873e-16 max(|| b_i - A x_i ||_1) 1.802238e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.547658e-03 (SUCCESS) Test #1657: shm_example_simple_lap_z_facto1_sched4_kwayprojections_svdbegin .........***Timeout 231.00 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ 
Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.353652e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.343024e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.854726e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.092007e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.207113e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 5.215530e-01 s Time to initialize coeftab 1.202155e+00 s Time to factorize 6.238084e+00 s ( 3.42 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 497 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1.25 Mo / 1.25 Mo Time to solve 6.192197e-01 s - iteration 1 : total iteration time 0.633 s error 1.1333e-14 Time for refinement 1.250116e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.133343e-14 max(|| b_i - A x_i ||_1) 2.158947e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.447756e-02 (SUCCESS) Start 1657: shm_example_simple_lap_z_facto1_sched4_kwayprojections_svdbegin Test #1658: shm_example_simple_lap_z_facto1_sched4_kwayprojections_svdend 
...........***Timeout 230.96 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.460672e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.543647e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.197777e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.254684e-01 s +-------------------------------------------------+ Analyze task: Total 
time for analyze 8.317415e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.160637e-01 s Time to initialize coeftab 8.192285e-02 s Time to factorize 1.029037e+00 s (20.71 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 497 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1.25 Mo / 1.25 Mo Time to solve 8.821306e-01 s Time for refinement 8.579848e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.766898e-16 max(|| b_i - A x_i ||_1) 1.842378e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.648944e-03 (SUCCESS) Start 1658: shm_example_simple_lap_z_facto1_sched4_kwayprojections_svdend Test #1659: shm_example_simple_lap_z_facto1_sched4_not_pqrcpbegin ...................***Timeout 230.89 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS 
Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.177892e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.490348e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.108313e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.514534e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.324536e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.354084e-01 s Time to initialize coeftab 8.478455e-01 s Time to factorize 4.604752e+00 s ( 4.63 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 497 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1.25 Mo / 1.25 Mo Time to solve 9.010356e-01 s - iteration 1 : total iteration time 1.39 s error 1.0278e-14 Time for refinement 2.425327e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.027655e-14 
max(|| b_i - A x_i ||_1) 1.855652e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.682439e-02 (SUCCESS) Start 1659: shm_example_simple_lap_z_facto1_sched4_not_pqrcpbegin Test #1660: shm_example_simple_lap_z_facto1_sched4_not_pqrcpend .....................***Timeout 230.83 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.359432e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.024765e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.363670e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 
Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.436654e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.320934e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 5.119260e-01 s Time to initialize coeftab 1.729388e-01 s Time to factorize 1.894085e+00 s (11.25 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 497 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1.25 Mo / 1.25 Mo Time to solve 9.167878e-01 s Time for refinement 5.108194e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.834495e-16 max(|| b_i - A x_i ||_1) 1.848042e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.663236e-03 (SUCCESS) Start 1660: shm_example_simple_lap_z_facto1_sched4_not_pqrcpend Test #1661: shm_example_simple_lap_z_facto1_sched4_kway_pqrcpbegin ..................***Timeout 230.78 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 
Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.955138e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.051189e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.324730e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.448884e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.481948e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.609338e-02 s Time to initialize coeftab 9.653468e-02 s Time to factorize 1.639543e+00 s (13.00 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 497 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1.25 Mo / 1.25 Mo Time to solve 8.399874e-01 s - 
iteration 1 : total iteration time 0.664 s error 1.1855e-14 Time for refinement 1.814156e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.185544e-14 max(|| b_i - A x_i ||_1) 2.018900e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.094369e-02 (SUCCESS) Start 1661: shm_example_simple_lap_z_facto1_sched4_kway_pqrcpbegin 1664/3626 Test #2038: mpi_dst_example_simple_lap_s_facto0_sched0_kway_tqrcpbegin ..............***Timeout 230.07 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.461832e+00 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.906498e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.914218e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.670707e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.394733e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.291686e+00 s Time to initialize coeftab 4.069853e-01 s Time to factorize 6.134088e+00 s (845.08 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.2 Ko / 44.3 Ko ------------------------------------------------ Total 68.4 Ko / 68.5 Ko Time to solve 6.055864e-01 s - iteration 1 : total iteration time 0.738 s error 5.5331e-11 Time for refinement 1.906058e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.039892e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.039892e-08 max(|| b_i - A x_i ||_1) 2.914809e-08 max(|| b_i - A x_i 
||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.662721e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.914809e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.662721e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.039892e-08 max(|| b_i - A x_i ||_1) 2.914809e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.662721e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.039892e-08 max(|| b_i - A x_i ||_1) 2.914809e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.662721e-01 (SUCCESS) Start 2038: mpi_dst_example_simple_lap_s_facto0_sched0_kway_tqrcpbegin 1664/3626 Test #2039: mpi_dst_example_simple_lap_s_facto0_sched0_kway_tqrcpend ................***Timeout 230.05 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 
2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.937580e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.119853e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.392272e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.702456e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.451397e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.934055e-01 s Time to initialize coeftab 1.599649e-01 s Time to factorize 2.552748e+00 s ( 1.98 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.167507e-01 s Time for refinement 1.658945e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.110107e-07 max(|| b_i 
- A x_i ||_2 / || b_i ||_2) 3.110107e-07 max(|| b_i - A x_i ||_1) 9.598076e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.206085e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.110107e-07 max(|| b_i - A x_i ||_1) 9.598076e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.206085e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.110107e-07 max(|| b_i - A x_i ||_1) 9.598076e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.206085e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 9.598076e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.206085e+00 (SUCCESS) Start 2039: mpi_dst_example_simple_lap_s_facto0_sched0_kway_tqrcpend 1664/3626 Test #2040: mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_tqrcpbegin ...***Timeout 230.04 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 
256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.995996e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.426113e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.099565e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 9.040863e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.774745e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.208784e-01 s Time to initialize coeftab 2.415107e-01 s Time to factorize 1.262356e+00 s ( 4.01 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.2 Ko / 44.3 Ko ------------------------------------------------ Total 68.4 Ko / 68.5 Ko Time to solve 1.743825e-02 s - iteration 1 : total iteration time 0.0436 s error 5.5597e-11 Time for refinement 9.682509e-02 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i 
||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.037595e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.037595e-08 max(|| b_i - A x_i ||_1) 2.915165e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.663169e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.915165e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.663169e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.037595e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.037595e-08 max(|| b_i - A x_i ||_1) 2.915165e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.663169e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.915165e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.663169e-01 (SUCCESS) Start 2040: mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_tqrcpbegin 1664/3626 Test #2041: mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_tqrcpend .....***Timeout 230.02 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 
Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.239439e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.176390e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.114549e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.169802e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.939810e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 5.700384e-01 s Time to initialize coeftab 7.150722e-01 s Time to factorize 4.489069e+00 s ( 1.13 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 3.985771e-01 s Time for refinement 4.227133e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 
5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.110107e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.110107e-07 max(|| b_i - A x_i ||_1) 9.598076e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.206085e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.110107e-07 max(|| b_i - A x_i ||_1) 9.598076e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.206085e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.110107e-07 max(|| b_i - A x_i ||_1) 9.598076e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.206085e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 9.598076e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.206085e+00 (SUCCESS) Start 2041: mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_tqrcpend 1664/3626 Test #2043: mpi_dst_example_simple_lap_s_facto0_sched0_not_rqrrtend .................***Timeout 230.13 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 
Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.911967e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.320106e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.659593e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 6.392536e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.618402e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.414302e-01 s Time to initialize coeftab 1.128861e-01 s Time to factorize 4.108375e-01 s (12.32 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 4.638813e-02 s Time for 
refinement 2.876061e-02 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.041276e-07 max(|| b_i - A x_i ||_1) 9.559704e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.201263e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.041276e-07 max(|| b_i - A x_i ||_1) 9.559704e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.201263e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.041276e-07 max(|| b_i - A x_i ||_1) 9.559704e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.201263e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.041276e-07 max(|| b_i - A x_i ||_1) 9.559704e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.201263e+00 (SUCCESS) Start 2043: mpi_dst_example_simple_lap_s_facto0_sched0_not_rqrrtend 1664/3626 Test #2046: mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_rqrrtbegin ...***Timeout 230.28 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress 
min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.737571e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.883884e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.275127e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.697811e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.059699e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.720500e-01 s Time to initialize coeftab 5.467524e-01 s Time to factorize 8.654958e+00 s (598.94 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 
Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.496870e-01 s - iteration 1 : total iteration time 0.09 s error 5.5928e-11 Time for refinement 2.088045e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.770989e-08 max(|| b_i - A x_i ||_1) 2.855581e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.588296e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.770989e-08 max(|| b_i - A x_i ||_1) 2.855581e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.588296e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.770989e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.770989e-08 max(|| b_i - A x_i ||_1) 2.855581e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.588296e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.855581e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.588296e-01 (SUCCESS) Start 2046: mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_rqrrtbegin 1664/3626 Test #2047: mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_rqrrtend .....***Timeout 230.25 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - 
Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.982372e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.670703e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.719296e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.561134e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.980354e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.814681e-01 s Time to initialize coeftab 1.632728e-01 s Time to factorize 2.082570e+00 s ( 2.43 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: 
------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 3.698981e-01 s Time for refinement 6.865980e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.041468e-07 max(|| b_i - A x_i ||_1) 9.564177e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.201825e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.041468e-07 max(|| b_i - A x_i ||_1) 9.564177e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.201825e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.041468e-07 max(|| b_i - A x_i ||_1) 9.564177e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.201825e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.041468e-07 max(|| b_i - A x_i ||_1) 9.564177e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.201825e+00 (SUCCESS) Start 2047: mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_rqrrtend 1664/3626 Test #2048: mpi_dst_example_simple_lap_s_facto0_sched0_kway_pqrcpilu0 ...............***Timeout 230.24 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: 
Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.117989e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.503627e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.236817e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 6.238526e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.600731e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.123696e-01 s Time to initialize coeftab 6.686512e-02 s Time to 
factorize 7.220485e-01 s ( 7.01 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 6.952887e-02 s - iteration 1 : total iteration time 0.0393 s error 3.0959e-11 Time for refinement 1.481187e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.058674e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.058674e-08 max(|| b_i - A x_i ||_1) 2.928500e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.679925e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.058674e-08 max(|| b_i - A x_i ||_1) 2.928500e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.679925e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.928500e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.679925e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.058674e-08 max(|| b_i - A x_i ||_1) 2.928500e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.679925e-01 (SUCCESS) Start 2048: mpi_dst_example_simple_lap_s_facto0_sched0_kway_pqrcpilu0 1664/3626 Test #2049: mpi_dst_example_simple_lap_s_facto0_sched0_kway_pqrcpilu1 ...............***Timeout 230.22 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.496616e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.147043e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.183860e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.762317e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 
4.476776e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.822125e-01 s Time to initialize coeftab 3.108509e-01 s Time to factorize 7.173730e+00 s (722.61 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 2.850077e-01 s - iteration 1 : total iteration time 0.803 s error 3.0929e-11 Time for refinement 1.574305e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.061699e-08 max(|| b_i - A x_i ||_1) 2.932194e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.684566e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.061699e-08 max(|| b_i - A x_i ||_1) 2.932194e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.684566e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.061699e-08 max(|| b_i - A x_i ||_1) 2.932194e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.684566e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.061699e-08 max(|| b_i - A x_i ||_1) 2.932194e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.684566e-01 (SUCCESS) Start 2049: mpi_dst_example_simple_lap_s_facto0_sched0_kway_pqrcpilu1 1664/3626 Test #2050: mpi_dst_example_simple_lap_s_facto1_sched0_not_svdbegin .................***Timeout 230.17 sec ischedInit: The thread number has been automatically set 
to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.112770e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.762662e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.330248e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL 
Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.735781e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.482989e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 6.041087e-01 s Time to initialize coeftab 5.748091e-01 s Time to factorize 8.869784e+00 s (604.19 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 4.641293e-01 s Time for refinement 8.225464e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.898319e-07 max(|| b_i - A x_i ||_1) 8.416911e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.057661e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.898319e-07 max(|| b_i - A x_i ||_1) 8.416911e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.057661e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.898319e-07 max(|| b_i - A x_i ||_1) 8.416911e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.057661e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.898319e-07 max(|| b_i - A x_i ||_1) 8.416911e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.057661e+00 (SUCCESS) Start 2050: mpi_dst_example_simple_lap_s_facto1_sched0_not_svdbegin Test #1665: shm_example_simple_lap_z_facto1_sched4_not_rqrcpbegin 
...................***Timeout 230.09 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.892964e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.476854e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.372758e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.714783e-01 s +-------------------------------------------------+ Analyze task: Total 
time for analyze 2.217087e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 5.419274e-02 s Time to initialize coeftab 2.120001e-01 s Time to factorize 1.311436e+00 s (16.25 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 497 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1.25 Mo / 1.25 Mo Time to solve 3.520412e-01 s - iteration 1 : total iteration time 0.371 s error 7.9714e-15 Time for refinement 8.074508e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.971753e-15 max(|| b_i - A x_i ||_1) 1.263391e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.187964e-02 (SUCCESS) Start 1665: shm_example_simple_lap_z_facto1_sched4_not_rqrcpbegin 1664/3626 Test #2051: mpi_dst_example_simple_lap_s_facto1_sched0_not_svdend ...................***Timeout 229.76 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance 
criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.814683e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.235302e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.769152e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.875604e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.486961e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 8.562729e-01 s Time to initialize coeftab 1.760462e-01 s Time to factorize 2.673330e+00 s ( 1.96 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko 
------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.563651e-01 s Time for refinement 3.145635e-01 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.687257e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.687257e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.687257e-07 max(|| b_i - A x_i ||_1) 7.436542e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.344687e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 7.436542e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.344687e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 7.436542e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.344687e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.687257e-07 max(|| b_i - A x_i ||_1) 7.436542e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.344687e-01 (SUCCESS) Start 2051: mpi_dst_example_simple_lap_s_facto1_sched0_not_svdend 1664/3626 Test #2052: mpi_dst_example_simple_lap_s_facto1_sched0_kway_svdbegin ................***Timeout 229.74 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory 
Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.871403e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.619219e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.305546e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.324219e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.747229e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 5.227336e-01 s Time to initialize coeftab 4.006461e-01 s Time to factorize 7.463828e+00 s (718.01 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in 
diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.978611e-01 s Time for refinement 2.821363e-01 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.907845e-07 max(|| b_i - A x_i ||_1) 8.445278e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.061226e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.907845e-07 max(|| b_i - A x_i ||_1) 8.445278e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.061226e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.907845e-07 max(|| b_i - A x_i ||_1) 8.445278e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.061226e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.907845e-07 max(|| b_i - A x_i ||_1) 8.445278e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.061226e+00 (SUCCESS) Start 2052: mpi_dst_example_simple_lap_s_facto1_sched0_kway_svdbegin 1664/3626 Test #2053: mpi_dst_example_simple_lap_s_facto1_sched0_kway_svdend ..................***Timeout 229.73 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI 
communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.398025e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.066323e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.942912e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.240652e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.232371e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.223357e-01 s Time to initialize coeftab 1.099938e-01 s Time to factorize 3.851115e+00 s ( 1.36 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: 
------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 4.495613e-01 s Time for refinement 7.620312e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.685047e-07 max(|| b_i - A x_i ||_1) 7.416855e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.319949e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.685047e-07 max(|| b_i - A x_i ||_1) 7.416855e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.319949e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.685047e-07 max(|| b_i - A x_i ||_1) 7.416855e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.319949e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.685047e-07 max(|| b_i - A x_i ||_1) 7.416855e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.319949e-01 (SUCCESS) Start 2053: mpi_dst_example_simple_lap_s_facto1_sched0_kway_svdend 1664/3626 Test #2054: mpi_dst_example_simple_lap_s_facto1_sched0_kwayprojections_svdbegin .....***Timeout 229.68 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: 
Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.200362e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.791670e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.804594e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.069503e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.484382e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.672270e-01 s Time to initialize coeftab 2.070783e-01 s Time to 
factorize 1.080006e+00 s ( 4.85 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.113623e-02 s Time for refinement 1.079264e-02 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.898319e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.898319e-07 max(|| b_i - A x_i ||_1) 8.416911e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.057661e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.898319e-07 max(|| b_i - A x_i ||_1) 8.416911e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.057661e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.898319e-07 max(|| b_i - A x_i ||_1) 8.416911e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.057661e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.416911e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.057661e+00 (SUCCESS) Start 2054: mpi_dst_example_simple_lap_s_facto1_sched0_kwayprojections_svdbegin 1664/3626 Test #2056: mpi_dst_example_simple_lap_s_facto1_sched0_not_pqrcpbegin ...............***Timeout 229.75 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: 
Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.928879e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.696173e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.600417e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.155871e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.411749e+00 s +-------------------------------------------------+ Factorization task: 
Factorization used: LDL^t Time to initialize internal csc 1.133518e-01 s Time to initialize coeftab 5.930664e-02 s Time to factorize 9.479599e-01 s ( 5.52 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.140853e-02 s - iteration 1 : total iteration time 0.00733 s error 5.5168e-11 Time for refinement 2.281562e-02 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.944054e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.944054e-08 max(|| b_i - A x_i ||_1) 2.931508e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.683705e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.944054e-08 max(|| b_i - A x_i ||_1) 2.931508e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.683705e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.944054e-08 max(|| b_i - A x_i ||_1) 2.931508e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.683705e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.931508e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.683705e-01 (SUCCESS) Start 2056: mpi_dst_example_simple_lap_s_facto1_sched0_not_pqrcpbegin 1664/3626 Test #2057: mpi_dst_example_simple_lap_s_facto1_sched0_not_pqrcpend .................***Timeout 229.74 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.962009e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.918627e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.050310e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 
2.067980e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.373158e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 8.463234e-01 s Time to initialize coeftab 1.329714e-01 s Time to factorize 7.607479e-01 s ( 6.88 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 2.184915e-01 s Time for refinement 3.053670e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.977971e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.977971e-07 max(|| b_i - A x_i ||_1) 8.565543e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.076338e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.565543e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.076338e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.977971e-07 max(|| b_i - A x_i ||_1) 8.565543e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.076338e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.977971e-07 max(|| b_i - A x_i ||_1) 8.565543e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.076338e+00 (SUCCESS) Start 2057: mpi_dst_example_simple_lap_s_facto1_sched0_not_pqrcpend 1664/3626 Test #2058: mpi_dst_example_simple_lap_s_facto1_sched0_kway_pqrcpbegin ..............***Timeout 229.75 sec ischedInit: The 
thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.064046e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.837201e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.513717e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 
MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.580523e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.299805e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.202779e-01 s Time to initialize coeftab 2.089968e-01 s Time to factorize 5.753016e+00 s (931.52 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 3.539739e-01 s - iteration 1 : total iteration time 0.375 s error 5.5073e-11 Time for refinement 9.831648e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.944046e-08 max(|| b_i - A x_i ||_1) 2.931508e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.683705e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.944046e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.944046e-08 max(|| b_i - A x_i ||_1) 2.931508e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.683705e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.931508e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.683705e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.944046e-08 max(|| b_i - A x_i ||_1) 2.931508e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.683705e-01 (SUCCESS) Start 2058: 
mpi_dst_example_simple_lap_s_facto1_sched0_kway_pqrcpbegin 1664/3626 Test #2060: mpi_dst_example_simple_lap_s_facto1_sched0_kwayprojections_pqrcpbegin ...***Timeout 229.85 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.151952e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.005539e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for 
reordering 5.894036e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.103193e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.146186e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.898969e-01 s Time to initialize coeftab 1.238151e-01 s Time to factorize 4.377963e+00 s ( 1.20 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 7.123068e-02 s - iteration 1 : total iteration time 0.0397 s error 5.5168e-11 Time for refinement 1.073251e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.944054e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.944054e-08 max(|| b_i - A x_i ||_1) 2.931508e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.683705e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.931508e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.683705e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.944054e-08 max(|| b_i - A x_i ||_1) 2.931508e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.683705e-01 (SUCCESS) 
max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.944054e-08 max(|| b_i - A x_i ||_1) 2.931508e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.683705e-01 (SUCCESS) Start 2060: mpi_dst_example_simple_lap_s_facto1_sched0_kwayprojections_pqrcpbegin Test #1669: shm_example_simple_lap_z_facto1_sched4_kwayprojections_rqrcpbegin .......***Timeout 229.72 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.716038e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.421414e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.344629e-01 s 
+-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.713650e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.784544e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 9.712808e-02 s Time to initialize coeftab 1.068207e+00 s Time to factorize 4.097153e+00 s ( 5.20 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 497 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1.25 Mo / 1.25 Mo Time to solve 1.392326e+00 s - iteration 1 : total iteration time 0.786 s error 1.4604e-14 Time for refinement 1.566119e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.460591e-14 max(|| b_i - A x_i ||_1) 2.520277e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.359514e-02 (SUCCESS) Start 1669: shm_example_simple_lap_z_facto1_sched4_kwayprojections_rqrcpbegin 1664/3626 Test #2063: mpi_dst_example_simple_lap_s_facto1_sched0_not_rqrcpend .................***Timeout 228.39 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread 
dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.326959e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.388002e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.738156e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.027871e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.878697e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 5.978747e-02 s Time to initialize coeftab 
3.778114e-02 s Time to factorize 1.247541e+00 s ( 4.20 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 6.100796e-02 s Time for refinement 1.120708e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.972569e-07 max(|| b_i - A x_i ||_1) 8.513419e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.069788e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.972569e-07 max(|| b_i - A x_i ||_1) 8.513419e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.069788e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.972569e-07 max(|| b_i - A x_i ||_1) 8.513419e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.069788e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.972569e-07 max(|| b_i - A x_i ||_1) 8.513419e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.069788e+00 (SUCCESS) Start 2063: mpi_dst_example_simple_lap_s_facto1_sched0_not_rqrcpend Test #1670: shm_example_simple_lap_z_facto1_sched4_kwayprojections_rqrcpend .........***Timeout 228.37 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: 
Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.419389e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.126615e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.165474e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.816647e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.562875e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.292814e-01 s Time to initialize coeftab 5.775157e-02 s Time to factorize 3.126913e+00 s ( 6.81 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: 
------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 497 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1.25 Mo / 1.25 Mo Time to solve 7.593707e-01 s Time for refinement 7.118208e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.848886e-16 max(|| b_i - A x_i ||_1) 1.885516e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.757796e-03 (SUCCESS) Start 1670: shm_example_simple_lap_z_facto1_sched4_kwayprojections_rqrcpend 1664/3626 Test #2064: mpi_dst_example_simple_lap_s_facto1_sched0_kway_rqrcpbegin ..............***Timeout 228.30 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix 
type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.557944e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.551879e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.062042e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.200020e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.389011e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 8.113250e-01 s Time to initialize coeftab 5.283707e-01 s Time to factorize 1.497167e+01 s (357.95 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.2 Ko / 44.3 Ko ------------------------------------------------ Total 68.4 Ko / 68.5 Ko Time to solve 3.838552e-02 s - iteration 1 : total iteration time 0.048 s error 5.5128e-11 Time for refinement 1.362993e-01 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i 
||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.782640e-08 max(|| b_i - A x_i ||_1) 2.847811e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.578532e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.782640e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.782640e-08 max(|| b_i - A x_i ||_1) 2.847811e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.578532e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.782640e-08 max(|| b_i - A x_i ||_1) 2.847811e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.578532e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.847811e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.578532e-01 (SUCCESS) Start 2064: mpi_dst_example_simple_lap_s_facto1_sched0_kway_rqrcpbegin 1664/3626 Test #2065: mpi_dst_example_simple_lap_s_facto1_sched0_kway_rqrcpend ................***Timeout 228.12 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has 
been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.933573e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.502837e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.534579e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 4.740127e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.435740e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 8.708220e-02 s Time to initialize coeftab 5.192755e-02 s Time to factorize 4.429622e-01 s (11.81 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 2.112404e-02 s Time for refinement 3.985529e-02 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 
max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.978469e-07 max(|| b_i - A x_i ||_1) 8.532148e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.072141e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.978469e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.978469e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.978469e-07 max(|| b_i - A x_i ||_1) 8.532148e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.072141e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.532148e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.072141e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.532148e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.072141e+00 (SUCCESS) Start 2065: mpi_dst_example_simple_lap_s_facto1_sched0_kway_rqrcpend 1664/3626 Test #2066: mpi_dst_example_simple_lap_s_facto1_sched0_kwayprojections_rqrcpbegin ...***Timeout 227.94 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS 
Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.793363e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.333150e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.581741e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.666747e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.796354e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 7.408208e-01 s Time to initialize coeftab 3.110611e-01 s Time to factorize 1.202530e+01 s (445.65 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.2 Ko / 44.3 Ko ------------------------------------------------ Total 68.4 Ko / 68.5 Ko Time to solve 1.100058e-01 s - iteration 1 : total iteration time 0.196 s 
error 5.5126e-11 Time for refinement 3.469067e-01 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.782640e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.782640e-08 max(|| b_i - A x_i ||_1) 2.847811e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.578532e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.847811e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.578532e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.782640e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.782640e-08 max(|| b_i - A x_i ||_1) 2.847811e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.578532e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.847811e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.578532e-01 (SUCCESS) Start 2066: mpi_dst_example_simple_lap_s_facto1_sched0_kwayprojections_rqrcpbegin 1664/3626 Test #2068: mpi_dst_example_simple_lap_s_facto1_sched0_not_tqrcpbegin ...............***Timeout 227.21 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 
Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.063298e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.431207e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.823891e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.581187e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.394033e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.945486e-01 s Time to initialize coeftab 2.173386e-01 s Time to factorize 1.139424e+01 s (470.33 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 
0 o / 0 o Outside 44.2 Ko / 44.3 Ko ------------------------------------------------ Total 68.4 Ko / 68.5 Ko Time to solve 3.442416e-01 s - iteration 1 : total iteration time 0.828 s error 5.5059e-11 Time for refinement 1.602102e+00 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.863290e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.863290e-08 max(|| b_i - A x_i ||_1) 2.888369e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.629497e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.888369e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.629497e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.863290e-08 max(|| b_i - A x_i ||_1) 2.888369e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.629497e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.863290e-08 max(|| b_i - A x_i ||_1) 2.888369e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.629497e-01 (SUCCESS) Start 2068: mpi_dst_example_simple_lap_s_facto1_sched0_not_tqrcpbegin 1664/3626 Test #2069: mpi_dst_example_simple_lap_s_facto1_sched0_not_tqrcpend .................***Timeout 227.07 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: 
AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.834339e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.749237e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.507487e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.036640e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.547065e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.406103e-01 s Time to initialize coeftab 5.438553e-02 s Time to factorize 8.548532e-01 s ( 6.12 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: 
------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 5.030285e-02 s Time for refinement 5.216515e-02 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.972700e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.972700e-07 max(|| b_i - A x_i ||_1) 8.514203e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.069887e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.514203e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.069887e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.972700e-07 max(|| b_i - A x_i ||_1) 8.514203e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.069887e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.972700e-07 max(|| b_i - A x_i ||_1) 8.514203e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.069887e+00 (SUCCESS) Start 2069: mpi_dst_example_simple_lap_s_facto1_sched0_not_tqrcpend Test #1676: shm_example_simple_lap_z_facto1_sched4_kwayprojections_tqrcpend .........***Timeout 225.81 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: 
PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.386503e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.104052e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.102085e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.573179e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.058997e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.093048e-01 s Time to initialize coeftab 1.989235e-01 s Time to factorize 1.933916e+00 s (11.02 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 497 
Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1.25 Mo / 1.25 Mo Time to solve 1.156561e+00 s Time for refinement 3.894746e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.834020e-16 max(|| b_i - A x_i ||_1) 1.885839e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.758611e-03 (SUCCESS) Start 1676: shm_example_simple_lap_z_facto1_sched4_kwayprojections_tqrcpend 1664/3626 Test #2071: mpi_dst_example_simple_lap_s_facto1_sched0_kway_tqrcpend ................***Timeout 225.34 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 
+-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.509679e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.672864e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.643533e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.130988e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.705923e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.023448e-01 s Time to initialize coeftab 7.922252e-02 s Time to factorize 2.520713e+00 s ( 2.08 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.046946e-01 s Time for refinement 2.366985e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.965841e-07 max(|| b_i - A x_i ||_1) 
8.485211e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.066243e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.965841e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.965841e-07 max(|| b_i - A x_i ||_1) 8.485211e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.066243e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.485211e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.066243e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.965841e-07 max(|| b_i - A x_i ||_1) 8.485211e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.066243e+00 (SUCCESS) Start 2071: mpi_dst_example_simple_lap_s_facto1_sched0_kway_tqrcpend Test #1681: shm_example_simple_lap_z_facto1_sched4_kwayprojections_rqrrtbegin .......***Timeout 225.28 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.415834e+00 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.669716e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.197598e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.617980e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.767842e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.214741e-02 s Time to initialize coeftab 3.903075e-01 s Time to factorize 4.329496e+00 s ( 4.92 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 497 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1.25 Mo / 1.25 Mo Time to solve 1.048743e+00 s - iteration 1 : total iteration time 0.606 s error 3.2141e-13 Time for refinement 1.153884e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.214109e-13 max(|| b_i - A x_i ||_1) 5.048470e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.273900e+00 (SUCCESS) Start 1681: shm_example_simple_lap_z_facto1_sched4_kwayprojections_rqrrtbegin Test #1684: shm_example_simple_lap_z_facto1_sched4_kway_pqrcpilu1 ...................***Timeout 224.29 sec ischedInit: The thread number has been automatically 
set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.574891e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.109722e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.175030e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.714302e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.983512e+00 s +-------------------------------------------------+ Factorization 
task: Factorization used: LDL^t Time to initialize internal csc 1.114080e-01 s Time to initialize coeftab 1.966208e-01 s Time to factorize 1.478713e+00 s (14.41 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 497 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1.25 Mo / 1.25 Mo Time to solve 2.532411e+00 s - iteration 1 : total iteration time 0.828 s error 1.0742e-15 Time for refinement 1.598535e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.085418e-15 max(|| b_i - A x_i ||_1) 1.233627e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.112858e-03 (SUCCESS) Start 1684: shm_example_simple_lap_z_facto1_sched4_kway_pqrcpilu1 1664/3626 Test #2072: mpi_dst_example_simple_lap_s_facto1_sched0_kwayprojections_tqrcpbegin ...***Timeout 224.19 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress 
min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.174973e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.437054e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.689430e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.177811e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.802515e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 5.467076e-01 s Time to initialize coeftab 3.072100e-01 s Time to factorize 3.314426e+00 s ( 1.58 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.2 Ko / 44.3 Ko ------------------------------------------------ Total 68.4 Ko / 68.5 Ko Time to solve 6.372186e-01 s - iteration 1 : total 
iteration time 1.41 s error 5.4989e-11 Time for refinement 2.038831e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.843252e-08 max(|| b_i - A x_i ||_1) 2.875234e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.612991e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.843252e-08 max(|| b_i - A x_i ||_1) 2.875234e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.612991e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.843252e-08 max(|| b_i - A x_i ||_1) 2.875234e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.612991e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.843252e-08 max(|| b_i - A x_i ||_1) 2.875234e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.612991e-01 (SUCCESS) Start 2072: mpi_dst_example_simple_lap_s_facto1_sched0_kwayprojections_tqrcpbegin 1664/3626 Test #2075: mpi_dst_example_simple_lap_s_facto1_sched0_not_rqrrtend .................***Timeout 221.68 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT 
Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.460637e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.943854e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.710706e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.300635e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.372261e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 5.359506e-01 s Time to initialize coeftab 8.245829e-02 s Time to factorize 2.960171e+00 s ( 1.77 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 
o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.677343e-01 s Time for refinement 3.431853e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.915137e-07 max(|| b_i - A x_i ||_1) 8.567685e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.076607e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.915137e-07 max(|| b_i - A x_i ||_1) 8.567685e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.076607e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.915137e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.915137e-07 max(|| b_i - A x_i ||_1) 8.567685e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.076607e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.567685e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.076607e+00 (SUCCESS) Start 2075: mpi_dst_example_simple_lap_s_facto1_sched0_not_rqrrtend 1664/3626 Test #2082: mpi_dst_example_simple_lap_s_facto2_sched0_not_svdbegin .................***Timeout 214.64 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of 
GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.786006e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.242357e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.289399e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 6.882487e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.900390e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.464604e-01 s Time to initialize coeftab 2.388171e-01 s Time to factorize 4.548769e+00 s ( 2.20 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank 
supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 7.852251e-02 s Time for refinement 7.732587e-02 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.912910e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.912910e-07 max(|| b_i - A x_i ||_1) 8.325834e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.046216e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.325834e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.046216e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.912910e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.912910e-07 max(|| b_i - A x_i ||_1) 8.325834e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.046216e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.325834e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.046216e+00 (SUCCESS) Start 2082: mpi_dst_example_simple_lap_s_facto2_sched0_not_svdbegin 1664/3626 Test #2087: mpi_dst_example_simple_lap_s_facto2_sched0_kwayprojections_svdend .......***Timeout 213.49 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) 
Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.504702e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.245497e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.622708e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 5.740520e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.290634e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.440209e-01 s Time to initialize coeftab 6.786447e-02 s Time to factorize 1.475086e+00 s ( 6.77 MFlop/s) Number of operations 
8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 1.226992e-01 s Time for refinement 2.369113e-01 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.698748e-07 max(|| b_i - A x_i ||_1) 7.340076e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.223469e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.698748e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.698748e-07 max(|| b_i - A x_i ||_1) 7.340076e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.223469e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.698748e-07 max(|| b_i - A x_i ||_1) 7.340076e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.223469e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 7.340076e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.223469e-01 (SUCCESS) Start 2087: mpi_dst_example_simple_lap_s_facto2_sched0_kwayprojections_svdend 1664/3626 Test #2088: mpi_dst_example_simple_lap_s_facto2_sched0_not_pqrcpbegin ...............***Timeout 213.32 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads 
per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.068220e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.246586e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.709136e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 6.864377e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.122908e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.431444e-01 s Time to 
initialize coeftab 1.734331e-01 s Time to factorize 1.764111e+00 s ( 5.66 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 2.878880e-02 s - iteration 1 : total iteration time 0.04 s error 9.1129e-11 Time for refinement 9.334133e-02 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.208992e-08 max(|| b_i - A x_i ||_1) 2.990471e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.757797e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.208992e-08 max(|| b_i - A x_i ||_1) 2.990471e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.757797e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.208992e-08 max(|| b_i - A x_i ||_1) 2.990471e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.757797e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.208992e-08 max(|| b_i - A x_i ||_1) 2.990471e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.757797e-01 (SUCCESS) Start 2088: mpi_dst_example_simple_lap_s_facto2_sched0_not_pqrcpbegin 1664/3626 Test #2089: mpi_dst_example_simple_lap_s_facto2_sched0_not_pqrcpend .................***Timeout 212.80 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 
Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.086037e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.624202e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.518978e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 6.883508e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 
5.803239e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.827859e-01 s Time to initialize coeftab 1.381648e-01 s Time to factorize 2.126856e+00 s ( 4.69 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 8.153613e-02 s - iteration 1 : total iteration time 0.23 s error 1.5144e-12 Time for refinement 4.524240e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.641699e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.641699e-08 max(|| b_i - A x_i ||_1) 2.715983e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.412878e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.715983e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.412878e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.641699e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.641699e-08 max(|| b_i - A x_i ||_1) 2.715983e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.412878e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.715983e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.412878e-01 (SUCCESS) Start 2089: mpi_dst_example_simple_lap_s_facto2_sched0_not_pqrcpend Test #1707: shm_example_simple_lap_z_facto2_sched4_kwayprojections_tqrcpbegin .......***Timeout 212.32 sec ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.102120e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.696054e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.334906e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 7.298094e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.508002e+00 s +-------------------------------------------------+ Factorization 
task: Factorization used: LU Time to initialize internal csc 5.885979e-01 s Time to initialize coeftab 1.024455e+00 s Time to factorize 9.950793e+00 s ( 4.02 MFlop/s) Number of operations 8.15 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 14.1 Ko Outside 16.9 Ko Low-rank supernodes Diag in diag 497 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 1.49 Mo / 1.49 Mo ------------------------------------------------ Total 2.01 Mo / 2.01 Mo Time to solve 1.027952e+00 s - iteration 1 : total iteration time 0.817 s error 2.8954e-14 Time for refinement 2.091406e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.895173e-14 max(|| b_i - A x_i ||_1) 3.289654e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.300912e-02 (SUCCESS) Start 1707: shm_example_simple_lap_z_facto2_sched4_kwayprojections_tqrcpbegin Test #1710: shm_example_simple_lap_z_facto2_sched4_not_rqrrtend .....................***Timeout 211.74 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of 
projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.792277e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.603946e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.824403e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 1.187036e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.047793e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.862306e-02 s Time to initialize coeftab 8.019223e-02 s Time to factorize 1.051314e+00 s (38.02 MFlop/s) Number of operations 8.15 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 14.1 Ko Outside 16.9 Ko Low-rank supernodes Diag in diag 497 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 1.49 Mo / 1.49 Mo ------------------------------------------------ Total 2.01 Mo / 2.01 Mo Time to solve 3.685821e-01 s Time for refinement 2.307069e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.658609e-16 max(|| b_i - A x_i ||_1) 1.764656e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.452825e-03 
(SUCCESS) Start 1710: shm_example_simple_lap_z_facto2_sched4_not_rqrrtend 1664/3626 Test #2091: mpi_dst_example_simple_lap_s_facto2_sched0_kway_pqrcpend ................***Timeout 210.05 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.966006e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.024378e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for 
reordering 2.164929e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 3.507910e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.449072e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 8.543364e-02 s Time to initialize coeftab 3.960597e-02 s Time to factorize 4.023254e-01 s (24.82 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 2.479178e-02 s - iteration 1 : total iteration time 0.0332 s error 1.4182e-12 Time for refinement 7.449090e-02 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.638982e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.638982e-08 max(|| b_i - A x_i ||_1) 2.711602e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.407373e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.638982e-08 max(|| b_i - A x_i ||_1) 2.711602e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.407373e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.638982e-08 max(|| b_i - A x_i ||_1) 2.711602e-08 max(|| b_i - A x_i ||_1 / (||A||_1 
* ||x_i||_oo * eps)) 3.407373e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.711602e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.407373e-01 (SUCCESS) Start 2091: mpi_dst_example_simple_lap_s_facto2_sched0_kway_pqrcpend 1664/3626 Test #2093: mpi_dst_example_simple_lap_s_facto2_sched0_kwayprojections_pqrcpend .....***Timeout 206.14 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.021642e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 
17.592432 Time to compute symbol matrix 4.233838e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.669121e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 3.409180e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.484982e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.036474e-01 s Time to initialize coeftab 3.653287e-02 s Time to factorize 1.051335e+00 s ( 9.50 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 3.169029e-02 s - iteration 1 : total iteration time 0.0563 s error 1.4571e-12 Time for refinement 1.470435e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.644962e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.644962e-08 max(|| b_i - A x_i ||_1) 2.717530e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.414822e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.644962e-08 max(|| b_i - A x_i ||_1) 2.717530e-08 max(|| b_i - A x_i ||_1 / 
(||A||_1 * ||x_i||_oo * eps)) 3.414822e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.717530e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.414822e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.644962e-08 max(|| b_i - A x_i ||_1) 2.717530e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.414822e-01 (SUCCESS) Start 2093: mpi_dst_example_simple_lap_s_facto2_sched0_kwayprojections_pqrcpend 1664/3626 Test #2095: mpi_dst_example_simple_lap_s_facto2_sched0_not_rqrcpend .................***Timeout 202.78 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.680363e+00 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.030547e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.482386e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.815622e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.060579e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 7.892533e-01 s Time to initialize coeftab 1.272299e-01 s Time to factorize 2.750499e+00 s ( 3.63 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 1.302617e-01 s - iteration 1 : total iteration time 0.157 s error 1.6167e-12 Time for refinement 4.397556e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.647298e-08 max(|| b_i - A x_i ||_1) 2.722673e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.421284e-01 
(SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.647298e-08 max(|| b_i - A x_i ||_1) 2.722673e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.421284e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.647298e-08 max(|| b_i - A x_i ||_1) 2.722673e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.421284e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.647298e-08 max(|| b_i - A x_i ||_1) 2.722673e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.421284e-01 (SUCCESS) Start 2095: mpi_dst_example_simple_lap_s_facto2_sched0_not_rqrcpend Test #1714: shm_example_simple_lap_z_facto2_sched4_kwayprojections_rqrrtend .........***Timeout 202.77 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.282349e+00 s +-------------------------------------------------+ Symbolic factorization subtask: 
Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.168346e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.752171e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 4.278525e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.978205e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.414455e-01 s Time to initialize coeftab 5.013330e-02 s Time to factorize 1.944100e+00 s (20.56 MFlop/s) Number of operations 8.15 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 14.1 Ko Outside 16.9 Ko Low-rank supernodes Diag in diag 497 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 1.49 Mo / 1.49 Mo ------------------------------------------------ Total 2.01 Mo / 2.01 Mo Time to solve 7.787577e-01 s Time for refinement 3.202434e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.701929e-16 max(|| b_i - A x_i ||_1) 1.779814e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.491074e-03 (SUCCESS) Start 1714: shm_example_simple_lap_z_facto2_sched4_kwayprojections_rqrrtend Start 2135: mpi_dst_example_simple_lap_d_facto0_sched0_kway_tqrcpend Start 2136: mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_tqrcpbegin Start 2137: mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_tqrcpend Start 2138: mpi_dst_example_simple_lap_d_facto0_sched0_not_rqrrtbegin Start 2139: 
mpi_dst_example_simple_lap_d_facto0_sched0_not_rqrrtend Start 2140: mpi_dst_example_simple_lap_d_facto0_sched0_kway_rqrrtbegin Start 2141: mpi_dst_example_simple_lap_d_facto0_sched0_kway_rqrrtend Start 2142: mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_rqrrtbegin Start 2143: mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_rqrrtend Start 2144: mpi_dst_example_simple_lap_d_facto0_sched0_kway_pqrcpilu0 Start 2145: mpi_dst_example_simple_lap_d_facto0_sched0_kway_pqrcpilu1 Start 2146: mpi_dst_example_simple_lap_d_facto1_sched0_not_svdbegin Start 2147: mpi_dst_example_simple_lap_d_facto1_sched0_not_svdend Start 2148: mpi_dst_example_simple_lap_d_facto1_sched0_kway_svdbegin Start 2149: mpi_dst_example_simple_lap_d_facto1_sched0_kway_svdend Start 2150: mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_svdbegin Start 2151: mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_svdend Start 2152: mpi_dst_example_simple_lap_d_facto1_sched0_not_pqrcpbegin Start 2153: mpi_dst_example_simple_lap_d_facto1_sched0_not_pqrcpend Start 2154: mpi_dst_example_simple_lap_d_facto1_sched0_kway_pqrcpbegin Start 2155: mpi_dst_example_simple_lap_d_facto1_sched0_kway_pqrcpend Start 2156: mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_pqrcpbegin Start 2157: mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_pqrcpend Start 2158: mpi_dst_example_simple_lap_d_facto1_sched0_not_rqrcpbegin Start 2159: mpi_dst_example_simple_lap_d_facto1_sched0_not_rqrcpend Start 2160: mpi_dst_example_simple_lap_d_facto1_sched0_kway_rqrcpbegin Start 2161: mpi_dst_example_simple_lap_d_facto1_sched0_kway_rqrcpend Start 2162: mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_rqrcpbegin Start 2163: mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_rqrcpend Start 2164: mpi_dst_example_simple_lap_d_facto1_sched0_not_tqrcpbegin Start 2165: mpi_dst_example_simple_lap_d_facto1_sched0_not_tqrcpend Start 2166: 
mpi_dst_example_simple_lap_d_facto1_sched0_kway_tqrcpbegin Start 2167: mpi_dst_example_simple_lap_d_facto1_sched0_kway_tqrcpend Start 2168: mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_tqrcpbegin Start 2169: mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_tqrcpend Start 2170: mpi_dst_example_simple_lap_d_facto1_sched0_not_rqrrtbegin Start 2171: mpi_dst_example_simple_lap_d_facto1_sched0_not_rqrrtend Start 2172: mpi_dst_example_simple_lap_d_facto1_sched0_kway_rqrrtbegin Start 2173: mpi_dst_example_simple_lap_d_facto1_sched0_kway_rqrrtend Start 2174: mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_rqrrtbegin Start 2175: mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_rqrrtend 1664/3626 Test #2096: mpi_dst_example_simple_lap_s_facto2_sched0_kway_rqrcpbegin .............. Passed 165.28 sec Test #1733: shm_example_simple_lap_z_facto3_sched4_kwayprojections_rqrcpbegin ....... Passed 160.95 sec Test #1716: shm_example_simple_lap_z_facto2_sched4_kway_pqrcpilu1 ................... Passed 184.79 sec 1667/3626 Test #2102: mpi_dst_example_simple_lap_s_facto2_sched0_kway_tqrcpbegin .............. Passed 85.80 sec 1668/3626 Test #2098: mpi_dst_example_simple_lap_s_facto2_sched0_kwayprojections_rqrcpbegin ... Passed 164.18 sec Test #1307: shm_example_simple_lap_s_facto1_sched4_not_pqrcpbegin ................... Passed 89.20 sec Test #1724: shm_example_simple_lap_z_facto3_sched4_not_pqrcpend ..................... Passed 184.14 sec Test #1803: c_mpi_rep_example_simple_lap_c_facto0 ................................... Passed 136.22 sec 1672/3626 Test #2103: mpi_dst_example_simple_lap_s_facto2_sched0_kway_tqrcpend ................ Passed 85.80 sec Test #1309: shm_example_simple_lap_s_facto1_sched4_kway_pqrcpbegin .................. Passed 89.12 sec Test #1734: shm_example_simple_lap_z_facto3_sched4_kwayprojections_rqrcpend ......... Passed 160.93 sec Test #1723: shm_example_simple_lap_z_facto3_sched4_not_pqrcpbegin ................... 
Passed 184.19 sec Test #1753: shm_example_simple_lap_z_facto4_sched4_kwayprojections_svdbegin ......... Passed 160.14 sec Test #1759: shm_example_simple_lap_z_facto4_sched4_kwayprojections_pqrcpbegin ....... Passed 159.96 sec Test #1722: shm_example_simple_lap_z_facto3_sched4_kwayprojections_svdend ........... Passed 184.26 sec Test #1245: shm_example_simple_lap_z_facto4_sched1_kway_pqrcpbegin .................. Passed 160.52 sec Test #1750: shm_example_simple_lap_z_facto4_sched4_not_svdend ....................... Passed 160.31 sec Test #1751: shm_example_simple_lap_z_facto4_sched4_kway_svdbegin .................... Passed 160.27 sec Test #1730: shm_example_simple_lap_z_facto3_sched4_not_rqrcpend ..................... Passed 161.09 sec Test #1738: shm_example_simple_lap_z_facto3_sched4_kway_tqrcpend .................... Passed 160.81 sec Test #1721: shm_example_simple_lap_z_facto3_sched4_kwayprojections_svdbegin ......... Passed 184.46 sec Test #1765: shm_example_simple_lap_z_facto4_sched4_kwayprojections_rqrcpbegin ....... Passed 159.79 sec Test #1741: shm_example_simple_lap_z_facto3_sched4_not_rqrrtbegin ................... Passed 160.72 sec Test #1317: shm_example_simple_lap_s_facto1_sched4_kwayprojections_rqrcpbegin ....... Passed 88.25 sec Test #1744: shm_example_simple_lap_z_facto3_sched4_kway_rqrrtend .................... Passed 160.50 sec Test #1762: shm_example_simple_lap_z_facto4_sched4_not_rqrcpend ..................... Passed 165.95 sec Test #1718: shm_example_simple_lap_z_facto3_sched4_not_svdend ....................... Passed 184.70 sec Test #1742: shm_example_simple_lap_z_facto3_sched4_not_rqrrtend ..................... Passed 160.58 sec Test #1735: shm_example_simple_lap_z_facto3_sched4_not_tqrcpbegin ................... Passed 160.87 sec Test #1746: shm_example_simple_lap_z_facto3_sched4_kwayprojections_rqrrtend ......... Passed 160.44 sec Test #1769: shm_example_simple_lap_z_facto4_sched4_kway_tqrcpbegin .................. 
Passed 85.39 sec Test #1772: shm_example_simple_lap_z_facto4_sched4_kwayprojections_tqrcpend ......... Passed 85.26 sec Test #1773: shm_example_simple_lap_z_facto4_sched4_not_rqrrtbegin ................... Passed 85.15 sec Test #1780: shm_example_simple_lap_z_facto4_sched4_kway_pqrcpilu1 ................... Passed 85.08 sec Test #1782: c_mpi_rep_example_analyze_lap_s_facto1 .................................. Passed 85.06 sec Test #1783: c_mpi_rep_example_analyze_lap_s_facto2 .................................. Passed 85.04 sec Test #1785: c_mpi_rep_example_analyze_lap_d_facto1 .................................. Passed 84.83 sec Test #1790: c_mpi_rep_example_analyze_lap_c_facto3 .................................. Passed 84.26 sec Test #1792: c_mpi_rep_example_analyze_lap_z_facto0 .................................. Passed 83.88 sec Test #1796: c_mpi_rep_example_analyze_lap_z_facto4 .................................. Passed 83.85 sec Test #1799: c_mpi_rep_example_simple_lap_s_facto2 ................................... Passed 83.80 sec Test #1289: shm_example_simple_lap_s_facto0_sched4_kway_tqrcpbegin .................. Passed 83.77 sec Test #1800: c_mpi_rep_example_simple_lap_d_facto0 ................................... Passed 83.74 sec Test #1804: c_mpi_rep_example_simple_lap_c_facto1 ................................... Passed 83.55 sec Test #1805: c_mpi_rep_example_simple_lap_c_facto2 ................................... Passed 83.47 sec Test #1808: c_mpi_rep_example_simple_lap_z_facto0 ................................... Passed 83.44 sec Test #1815: c_mpi_rep_example_simple_solve_and_refine_lap_s_facto2 .................. Passed 83.21 sec Test #1816: c_mpi_rep_example_simple_solve_and_refine_lap_d_facto0 .................. Passed 82.99 sec Test #1818: c_mpi_rep_example_simple_solve_and_refine_lap_d_facto2 .................. Passed 82.87 sec Test #1820: c_mpi_rep_example_simple_solve_and_refine_lap_c_facto1 .................. 
Passed 82.78 sec Test #1821: c_mpi_rep_example_simple_solve_and_refine_lap_c_facto2 .................. Passed 82.72 sec Test #1822: c_mpi_rep_example_simple_solve_and_refine_lap_c_facto3 .................. Passed 82.69 sec Test #1825: c_mpi_rep_example_simple_solve_and_refine_lap_z_facto1 .................. Passed 82.54 sec Test #1826: c_mpi_rep_example_simple_solve_and_refine_lap_z_facto2 .................. Passed 82.48 sec Test #1827: c_mpi_rep_example_simple_solve_and_refine_lap_z_facto3 .................. Passed 82.32 sec Test #1831: c_mpi_rep_example_simple_trans_lap_s_facto2 ............................. Passed 82.19 sec Test #1832: c_mpi_rep_example_simple_trans_lap_d_facto0 ............................. Passed 81.96 sec Test #1833: c_mpi_rep_example_simple_trans_lap_d_facto1 ............................. Passed 81.89 sec Test #1834: c_mpi_rep_example_simple_trans_lap_d_facto2 ............................. Passed 81.86 sec Test #1836: c_mpi_rep_example_simple_trans_lap_c_facto1 ............................. Passed 81.67 sec Test #1842: c_mpi_rep_example_simple_trans_lap_z_facto2 ............................. Passed 81.47 sec Test #1844: c_mpi_rep_example_simple_trans_lap_z_facto4 ............................. Passed 81.44 sec Test #1310: shm_example_simple_lap_s_facto1_sched4_kway_pqrcpend .................... Passed 81.42 sec Test #1316: shm_example_simple_lap_s_facto1_sched4_kway_rqrcpend .................... Passed 81.36 sec Test #1323: shm_example_simple_lap_s_facto1_sched4_kwayprojections_tqrcpbegin ....... Passed 81.30 sec Test #1325: shm_example_simple_lap_s_facto1_sched4_not_rqrrtbegin ................... Passed 81.29 sec Test #1327: shm_example_simple_lap_s_facto1_sched4_kway_rqrrtbegin .................. Passed 81.27 sec Test #1329: shm_example_simple_lap_s_facto1_sched4_kwayprojections_rqrrtbegin ....... Passed 81.27 sec Test #1330: shm_example_simple_lap_s_facto1_sched4_kwayprojections_rqrrtend ......... 
Passed 81.25 sec Test #1331: shm_example_simple_lap_s_facto1_sched4_kway_pqrcpilu0 ................... Passed 81.23 sec Test #1326: shm_example_simple_lap_s_facto1_sched4_not_rqrrtend ..................... Passed 81.22 sec Test #1332: shm_example_simple_lap_s_facto1_sched4_kway_pqrcpilu1 ................... Passed 81.21 sec Test #1335: shm_example_simple_lap_s_facto2_sched4_kway_svdbegin .................... Passed 79.17 sec Test #1340: shm_example_simple_lap_s_facto2_sched4_not_pqrcpend ..................... Passed 78.73 sec Test #1855: c_mpi_rep_example_step-by-step_lap_c_facto4 ............................. Passed 75.43 sec Test #1362: shm_example_simple_lap_s_facto2_sched4_kwayprojections_rqrrtend ......... Passed 73.70 sec Test #1366: shm_example_simple_lap_d_facto0_sched4_not_svdend ....................... Passed 73.64 sec Test #1367: shm_example_simple_lap_d_facto0_sched4_kway_svdbegin .................... Passed 73.61 sec Test #1380: shm_example_simple_lap_d_facto0_sched4_kway_rqrcpend .................... Passed 60.40 sec Test #1394: shm_example_simple_lap_d_facto0_sched4_kwayprojections_rqrrtend ......... Passed 60.01 sec Test #1398: shm_example_simple_lap_d_facto1_sched4_not_svdend ....................... Passed 59.92 sec Test #1400: shm_example_simple_lap_d_facto1_sched4_kway_svdend ...................... Passed 59.91 sec Test #1414: shm_example_simple_lap_d_facto1_sched4_kwayprojections_rqrcpend ......... Passed 59.23 sec Test #1416: shm_example_simple_lap_d_facto1_sched4_not_tqrcpend ..................... Passed 58.89 sec Test #1420: shm_example_simple_lap_d_facto1_sched4_kwayprojections_tqrcpend ......... Passed 58.50 sec Test #1421: shm_example_simple_lap_d_facto1_sched4_not_rqrrtbegin ................... Passed 58.47 sec Test #1425: shm_example_simple_lap_d_facto1_sched4_kwayprojections_rqrrtbegin ....... Passed 58.30 sec Test #1431: shm_example_simple_lap_d_facto2_sched4_kway_svdbegin .................... 
Passed 56.95 sec Test #1433: shm_example_simple_lap_d_facto2_sched4_kwayprojections_svdbegin ......... Passed 56.84 sec Test #1436: shm_example_simple_lap_d_facto2_sched4_not_pqrcpend ..................... Passed 56.78 sec Test #1438: shm_example_simple_lap_d_facto2_sched4_kway_pqrcpend .................... Passed 56.42 sec Test #1441: shm_example_simple_lap_d_facto2_sched4_not_rqrcpbegin ................... Passed 56.30 sec Test #1442: shm_example_simple_lap_d_facto2_sched4_not_rqrcpend ..................... Passed 56.24 sec Test #1458: shm_example_simple_lap_d_facto2_sched4_kwayprojections_rqrrtend ......... Passed 55.43 sec Test #1468: shm_example_simple_lap_c_facto0_sched4_not_pqrcpend ..................... Passed 53.54 sec Test #1474: shm_example_simple_lap_c_facto0_sched4_not_rqrcpend ..................... Passed 50.61 sec Test #1484: shm_example_simple_lap_c_facto0_sched4_kwayprojections_tqrcpend ......... Passed 49.62 sec Test #1487: shm_example_simple_lap_c_facto0_sched4_kway_rqrrtbegin .................. Passed 49.61 sec Test #1489: shm_example_simple_lap_c_facto0_sched4_kwayprojections_rqrrtbegin ....... Passed 49.59 sec Test #1492: shm_example_simple_lap_c_facto0_sched4_kway_pqrcpilu1 ................... Passed 49.42 sec Test #1501: shm_example_simple_lap_c_facto1_sched4_kway_pqrcpbegin .................. Passed 49.19 sec Test #1502: shm_example_simple_lap_c_facto1_sched4_kway_pqrcpend .................... Passed 49.18 sec Test #1503: shm_example_simple_lap_c_facto1_sched4_kwayprojections_pqrcpbegin ....... Passed 49.15 sec Test #1506: shm_example_simple_lap_c_facto1_sched4_not_rqrcpend ..................... Passed 49.14 sec Test #1510: shm_example_simple_lap_c_facto1_sched4_kwayprojections_rqrcpend ......... Passed 49.13 sec Test #1512: shm_example_simple_lap_c_facto1_sched4_not_tqrcpend ..................... Passed 49.05 sec Test #1516: shm_example_simple_lap_c_facto1_sched4_kwayprojections_tqrcpend ......... 
Passed 48.81 sec Test #1520: shm_example_simple_lap_c_facto1_sched4_kway_rqrrtend .................... Passed 48.48 sec Test #1862: c_mpi_rep_example_personal_lap_s_facto1 ................................. Passed 29.78 sec 1773/3626 Test #2006: mpi_dst_example_simple_lap_d_facto1_sched4_1d ...........................***Timeout 277.99 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2006: mpi_dst_example_simple_lap_d_facto1_sched4_1d Test #1649: shm_example_simple_lap_z_facto0_sched4_kwayprojections_rqrrtbegin .......***Timeout 250.29 sec Start 1649: shm_example_simple_lap_z_facto0_sched4_kwayprojections_rqrrtbegin Test #1163: shm_example_simple_lap_z_facto1_sched1_kwayprojections_tqrcpbegin .......***Timeout 239.10 sec 1774/3626 Test #2037: mpi_dst_example_simple_lap_s_facto0_sched0_not_tqrcpend .................***Timeout 233.29 sec ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2037: mpi_dst_example_simple_lap_s_facto0_sched0_not_tqrcpend 1774/3626 Test #2042: mpi_dst_example_simple_lap_s_facto0_sched0_not_rqrrtbegin ...............***Timeout 233.11 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: 
PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2042: mpi_dst_example_simple_lap_s_facto0_sched0_not_rqrrtbegin 1774/3626 Test #2044: mpi_dst_example_simple_lap_s_facto0_sched0_kway_rqrrtbegin ..............***Timeout 233.06 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 
2044: mpi_dst_example_simple_lap_s_facto0_sched0_kway_rqrrtbegin 1774/3626 Test #2045: mpi_dst_example_simple_lap_s_facto0_sched0_kway_rqrrtend ................***Timeout 233.04 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2045: mpi_dst_example_simple_lap_s_facto0_sched0_kway_rqrrtend 1774/3626 Test #2055: mpi_dst_example_simple_lap_s_facto1_sched0_kwayprojections_svdend .......***Timeout 232.27 sec Start 2055: mpi_dst_example_simple_lap_s_facto1_sched0_kwayprojections_svdend 1774/3626 Test #2059: mpi_dst_example_simple_lap_s_facto1_sched0_kway_pqrcpend ................***Timeout 232.20 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: 
Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 2059: mpi_dst_example_simple_lap_s_facto1_sched0_kway_pqrcpend Test #1668: shm_example_simple_lap_z_facto1_sched4_kway_rqrcpend ....................***Timeout 232.15 sec Start 1668: shm_example_simple_lap_z_facto1_sched4_kway_rqrcpend Test #1671: shm_example_simple_lap_z_facto1_sched4_not_tqrcpbegin ...................***Timeout 230.60 sec Start 1671: shm_example_simple_lap_z_facto1_sched4_not_tqrcpbegin 1774/3626 Test #2070: mpi_dst_example_simple_lap_s_facto1_sched0_kway_tqrcpbegin ..............***Timeout 229.13 sec Start 2070: mpi_dst_example_simple_lap_s_facto1_sched0_kway_tqrcpbegin 1774/3626 Test #2090: mpi_dst_example_simple_lap_s_facto2_sched0_kway_pqrcpbegin ..............***Timeout 211.90 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple 
Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2090: mpi_dst_example_simple_lap_s_facto2_sched0_kway_pqrcpbegin 1774/3626 Test #2094: mpi_dst_example_simple_lap_s_facto2_sched0_not_rqrcpbegin ...............***Timeout 207.62 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2094: mpi_dst_example_simple_lap_s_facto2_sched0_not_rqrcpbegin Start 2176: 
mpi_dst_example_simple_lap_d_facto1_sched0_kway_pqrcpilu0 Start 2177: mpi_dst_example_simple_lap_d_facto1_sched0_kway_pqrcpilu1 Start 2178: mpi_dst_example_simple_lap_d_facto2_sched0_not_svdbegin Start 2179: mpi_dst_example_simple_lap_d_facto2_sched0_not_svdend Start 2180: mpi_dst_example_simple_lap_d_facto2_sched0_kway_svdbegin Start 2181: mpi_dst_example_simple_lap_d_facto2_sched0_kway_svdend Start 2182: mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_svdbegin Start 2183: mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_svdend Start 2184: mpi_dst_example_simple_lap_d_facto2_sched0_not_pqrcpbegin Start 2185: mpi_dst_example_simple_lap_d_facto2_sched0_not_pqrcpend Start 2186: mpi_dst_example_simple_lap_d_facto2_sched0_kway_pqrcpbegin Start 2187: mpi_dst_example_simple_lap_d_facto2_sched0_kway_pqrcpend Start 2188: mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_pqrcpbegin Start 2189: mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_pqrcpend Start 2190: mpi_dst_example_simple_lap_d_facto2_sched0_not_rqrcpbegin Start 2191: mpi_dst_example_simple_lap_d_facto2_sched0_not_rqrcpend Start 2192: mpi_dst_example_simple_lap_d_facto2_sched0_kway_rqrcpbegin Start 2193: mpi_dst_example_simple_lap_d_facto2_sched0_kway_rqrcpend Start 2194: mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_rqrcpbegin Start 2195: mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_rqrcpend Start 2196: mpi_dst_example_simple_lap_d_facto2_sched0_not_tqrcpbegin Start 2197: mpi_dst_example_simple_lap_d_facto2_sched0_not_tqrcpend Start 2198: mpi_dst_example_simple_lap_d_facto2_sched0_kway_tqrcpbegin Start 2199: mpi_dst_example_simple_lap_d_facto2_sched0_kway_tqrcpend Start 2200: mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_tqrcpbegin Start 2201: mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_tqrcpend Start 2202: mpi_dst_example_simple_lap_d_facto2_sched0_not_rqrrtbegin Start 2203: 
mpi_dst_example_simple_lap_d_facto2_sched0_not_rqrrtend Start 2204: mpi_dst_example_simple_lap_d_facto2_sched0_kway_rqrrtbegin Start 2205: mpi_dst_example_simple_lap_d_facto2_sched0_kway_rqrrtend Start 2206: mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_rqrrtbegin Start 2207: mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_rqrrtend Start 2208: mpi_dst_example_simple_lap_d_facto2_sched0_kway_pqrcpilu0 Start 2209: mpi_dst_example_simple_lap_d_facto2_sched0_kway_pqrcpilu1 Start 2210: mpi_dst_example_simple_lap_c_facto0_sched0_not_svdbegin Start 2211: mpi_dst_example_simple_lap_c_facto0_sched0_not_svdend Start 2212: mpi_dst_example_simple_lap_c_facto0_sched0_kway_svdbegin Start 2213: mpi_dst_example_simple_lap_c_facto0_sched0_kway_svdend Start 2214: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_svdbegin Start 2215: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_svdend Start 2216: mpi_dst_example_simple_lap_c_facto0_sched0_not_pqrcpbegin Start 2217: mpi_dst_example_simple_lap_c_facto0_sched0_not_pqrcpend Start 2218: mpi_dst_example_simple_lap_c_facto0_sched0_kway_pqrcpbegin Start 2219: mpi_dst_example_simple_lap_c_facto0_sched0_kway_pqrcpend Start 2220: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_pqrcpbegin Start 2221: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_pqrcpend Start 2222: mpi_dst_example_simple_lap_c_facto0_sched0_not_rqrcpbegin Start 2223: mpi_dst_example_simple_lap_c_facto0_sched0_not_rqrcpend Start 2224: mpi_dst_example_simple_lap_c_facto0_sched0_kway_rqrcpbegin Start 2225: mpi_dst_example_simple_lap_c_facto0_sched0_kway_rqrcpend Start 2226: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_rqrcpbegin Start 2227: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_rqrcpend Start 2228: mpi_dst_example_simple_lap_c_facto0_sched0_not_tqrcpbegin Start 2229: mpi_dst_example_simple_lap_c_facto0_sched0_not_tqrcpend Start 2230: 
mpi_dst_example_simple_lap_c_facto0_sched0_kway_tqrcpbegin Start 2231: mpi_dst_example_simple_lap_c_facto0_sched0_kway_tqrcpend Start 2232: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_tqrcpbegin Start 2233: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_tqrcpend Start 2234: mpi_dst_example_simple_lap_c_facto0_sched0_not_rqrrtbegin Start 2235: mpi_dst_example_simple_lap_c_facto0_sched0_not_rqrrtend Start 2236: mpi_dst_example_simple_lap_c_facto0_sched0_kway_rqrrtbegin Start 2237: mpi_dst_example_simple_lap_c_facto0_sched0_kway_rqrrtend Start 2238: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_rqrrtbegin Start 2239: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_rqrrtend Start 2240: mpi_dst_example_simple_lap_c_facto0_sched0_kway_pqrcpilu0 Start 2241: mpi_dst_example_simple_lap_c_facto0_sched0_kway_pqrcpilu1 Start 2242: mpi_dst_example_simple_lap_c_facto1_sched0_not_svdbegin Start 2243: mpi_dst_example_simple_lap_c_facto1_sched0_not_svdend Start 2244: mpi_dst_example_simple_lap_c_facto1_sched0_kway_svdbegin Start 2245: mpi_dst_example_simple_lap_c_facto1_sched0_kway_svdend Start 2246: mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_svdbegin Start 2247: mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_svdend Start 2248: mpi_dst_example_simple_lap_c_facto1_sched0_not_pqrcpbegin Start 2249: mpi_dst_example_simple_lap_c_facto1_sched0_not_pqrcpend Start 2250: mpi_dst_example_simple_lap_c_facto1_sched0_kway_pqrcpbegin Start 2251: mpi_dst_example_simple_lap_c_facto1_sched0_kway_pqrcpend Start 2252: mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_pqrcpbegin Start 2253: mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_pqrcpend Start 2254: mpi_dst_example_simple_lap_c_facto1_sched0_not_rqrcpbegin Start 2255: mpi_dst_example_simple_lap_c_facto1_sched0_not_rqrcpend Start 2256: mpi_dst_example_simple_lap_c_facto1_sched0_kway_rqrcpbegin Start 2257: 
mpi_dst_example_simple_lap_c_facto1_sched0_kway_rqrcpend Start 2258: mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_rqrcpbegin Start 2259: mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_rqrcpend Start 2260: mpi_dst_example_simple_lap_c_facto1_sched0_not_tqrcpbegin Start 2261: mpi_dst_example_simple_lap_c_facto1_sched0_not_tqrcpend Start 2262: mpi_dst_example_simple_lap_c_facto1_sched0_kway_tqrcpbegin Start 2263: mpi_dst_example_simple_lap_c_facto1_sched0_kway_tqrcpend Start 2264: mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_tqrcpbegin Start 2265: mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_tqrcpend Start 2266: mpi_dst_example_simple_lap_c_facto1_sched0_not_rqrrtbegin Start 2267: mpi_dst_example_simple_lap_c_facto1_sched0_not_rqrrtend Start 2268: mpi_dst_example_simple_lap_c_facto1_sched0_kway_rqrrtbegin Start 2269: mpi_dst_example_simple_lap_c_facto1_sched0_kway_rqrrtend Start 2270: mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_rqrrtbegin Start 2271: mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_rqrrtend Start 2272: mpi_dst_example_simple_lap_c_facto1_sched0_kway_pqrcpilu0 Start 2273: mpi_dst_example_simple_lap_c_facto1_sched0_kway_pqrcpilu1 Start 2274: mpi_dst_example_simple_lap_c_facto2_sched0_not_svdbegin Start 2275: mpi_dst_example_simple_lap_c_facto2_sched0_not_svdend Start 2276: mpi_dst_example_simple_lap_c_facto2_sched0_kway_svdbegin Start 2277: mpi_dst_example_simple_lap_c_facto2_sched0_kway_svdend Start 2278: mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_svdbegin Start 2279: mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_svdend Start 2280: mpi_dst_example_simple_lap_c_facto2_sched0_not_pqrcpbegin Start 2281: mpi_dst_example_simple_lap_c_facto2_sched0_not_pqrcpend Start 2282: mpi_dst_example_simple_lap_c_facto2_sched0_kway_pqrcpbegin Start 2283: mpi_dst_example_simple_lap_c_facto2_sched0_kway_pqrcpend Start 2284: 
mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_pqrcpbegin Start 2285: mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_pqrcpend 1774/3626 Test #2101: mpi_dst_example_simple_lap_s_facto2_sched0_not_tqrcpend ................. Passed 88.84 sec Test #1766: shm_example_simple_lap_z_facto4_sched4_kwayprojections_rqrcpend ......... Passed 162.62 sec Test #1838: c_mpi_rep_example_simple_trans_lap_c_facto3 ............................. Passed 84.42 sec Start 2286: mpi_dst_example_simple_lap_c_facto2_sched0_not_rqrcpbegin Start 2287: mpi_dst_example_simple_lap_c_facto2_sched0_not_rqrcpend Start 2288: mpi_dst_example_simple_lap_c_facto2_sched0_kway_rqrcpbegin Test #1755: shm_example_simple_lap_z_facto4_sched4_not_pqrcpbegin ................... Passed 163.18 sec Test #1726: shm_example_simple_lap_z_facto3_sched4_kway_pqrcpend .................... Passed 164.40 sec Start 2289: mpi_dst_example_simple_lap_c_facto2_sched0_kway_rqrcpend Start 2290: mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_rqrcpbegin Test #1812: c_mpi_rep_example_simple_lap_z_facto4 ................................... Passed 86.89 sec Start 2291: mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_rqrcpend Test #1757: shm_example_simple_lap_z_facto4_sched4_kway_pqrcpbegin .................. Passed 168.47 sec Start 2292: mpi_dst_example_simple_lap_c_facto2_sched0_not_tqrcpbegin Test #1524: shm_example_simple_lap_c_facto1_sched4_kway_pqrcpilu1 ................... Passed 54.97 sec Start 2293: mpi_dst_example_simple_lap_c_facto2_sched0_not_tqrcpend Test #1349: shm_example_simple_lap_s_facto2_sched4_kwayprojections_rqrcpbegin ....... Passed 89.19 sec Start 2294: mpi_dst_example_simple_lap_c_facto2_sched0_kway_tqrcpbegin Test #1429: shm_example_simple_lap_d_facto2_sched4_not_svdbegin ..................... Passed 67.62 sec Start 2295: mpi_dst_example_simple_lap_c_facto2_sched0_kway_tqrcpend Test #1830: c_mpi_rep_example_simple_trans_lap_s_facto1 ............................. 
Passed 93.08 sec Start 2296: mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_tqrcpbegin 1785/3626 Test #2097: mpi_dst_example_simple_lap_s_facto2_sched0_kway_rqrcpend ................ Passed 176.15 sec Start 2297: mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_tqrcpend 1786/3626 Test #2099: mpi_dst_example_simple_lap_s_facto2_sched0_kwayprojections_rqrcpend ..... Passed 175.53 sec Start 2298: mpi_dst_example_simple_lap_c_facto2_sched0_not_rqrrtbegin 1787/3626 Test #2100: mpi_dst_example_simple_lap_s_facto2_sched0_not_tqrcpbegin ............... Passed 175.54 sec Start 2299: mpi_dst_example_simple_lap_c_facto2_sched0_not_rqrrtend Test #1720: shm_example_simple_lap_z_facto3_sched4_kway_svdend ......................***Timeout 199.42 sec Start 1720: shm_example_simple_lap_z_facto3_sched4_kway_svdend Test #1226: shm_example_simple_lap_z_facto3_sched1_kway_tqrcpend ....................***Timeout 199.55 sec ischedInit: The thread number has been automatically set to 256 Start 2300: mpi_dst_example_simple_lap_c_facto2_sched0_kway_rqrrtbegin Test #1835: c_mpi_rep_example_simple_trans_lap_c_facto0 ............................. Passed 98.22 sec Start 2301: mpi_dst_example_simple_lap_c_facto2_sched0_kway_rqrrtend Test #1871: c_mpi_rep_example_personal_lap_c_facto4 ................................. Passed 46.13 sec Start 2302: mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_rqrrtbegin Test #1456: shm_example_simple_lap_d_facto2_sched4_kway_rqrrtend .................... Passed 74.62 sec Start 2303: mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_rqrrtend Test #1760: shm_example_simple_lap_z_facto4_sched4_kwayprojections_pqrcpend ......... Passed 183.27 sec Start 2304: mpi_dst_example_simple_lap_c_facto2_sched0_kway_pqrcpilu0 Test #1884: c_mpi_rep_example_simple_single_mm2 ..................................... 
Passed 48.36 sec Start 2305: mpi_dst_example_simple_lap_c_facto2_sched0_kway_pqrcpilu1 Test #1413: shm_example_simple_lap_d_facto1_sched4_kwayprojections_rqrcpbegin ....... Passed 86.23 sec Start 2306: mpi_dst_example_simple_lap_c_facto3_sched0_not_svdbegin Test #1451: shm_example_simple_lap_d_facto2_sched4_kwayprojections_tqrcpbegin ....... Passed 85.59 sec Start 2307: mpi_dst_example_simple_lap_c_facto3_sched0_not_svdend Test #1483: shm_example_simple_lap_c_facto0_sched4_kwayprojections_tqrcpbegin ....... Passed 79.33 sec Start 2308: mpi_dst_example_simple_lap_c_facto3_sched0_kway_svdbegin Test #1802: c_mpi_rep_example_simple_lap_d_facto2 ................................... Passed 115.93 sec Start 2309: mpi_dst_example_simple_lap_c_facto3_sched0_kway_svdend Test #1432: shm_example_simple_lap_d_facto2_sched4_kway_svdend ...................... Passed 90.14 sec Start 2310: mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_svdbegin Test #1434: shm_example_simple_lap_d_facto2_sched4_kwayprojections_svdend ........... Passed 90.11 sec Start 2311: mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_svdend Test #1740: shm_example_simple_lap_z_facto3_sched4_kwayprojections_tqrcpend ......... 
Passed 195.63 sec Start 2312: mpi_dst_example_simple_lap_c_facto3_sched0_not_pqrcpbegin Test #1240: shm_example_simple_lap_z_facto4_sched1_kway_svdend ......................***Timeout 198.61 sec Test #1243: shm_example_simple_lap_z_facto4_sched1_not_pqrcpbegin ...................***Timeout 198.36 sec Test #1745: shm_example_simple_lap_z_facto3_sched4_kwayprojections_rqrrtbegin .......***Timeout 198.18 sec Start 1745: shm_example_simple_lap_z_facto3_sched4_kwayprojections_rqrrtbegin Test #1747: shm_example_simple_lap_z_facto3_sched4_kway_pqrcpilu0 ...................***Timeout 198.13 sec Start 1747: shm_example_simple_lap_z_facto3_sched4_kway_pqrcpilu0 Test #1752: shm_example_simple_lap_z_facto4_sched4_kway_svdend ......................***Timeout 197.95 sec Start 1752: shm_example_simple_lap_z_facto4_sched4_kway_svdend Test #1886: c_mpi_rep_example_step-by-step_single_mm ................................ Passed 60.47 sec Test #1756: shm_example_simple_lap_z_facto4_sched4_not_pqrcpend .....................***Timeout 197.81 sec Start 1756: shm_example_simple_lap_z_facto4_sched4_not_pqrcpend Start 2313: mpi_dst_example_simple_lap_c_facto3_sched0_not_pqrcpend Start 2314: mpi_dst_example_simple_lap_c_facto3_sched0_kway_pqrcpbegin Start 2315: mpi_dst_example_simple_lap_c_facto3_sched0_kway_pqrcpend Test #1877: c_mpi_rep_example_simple_scotch_rsa ..................................... 
Passed 62.20 sec Start 2316: mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_pqrcpbegin Test #1823: c_mpi_rep_example_simple_solve_and_refine_lap_c_facto4 ..................***Timeout 156.75 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Start 1823: c_mpi_rep_example_simple_solve_and_refine_lap_c_facto4 Test #1811: c_mpi_rep_example_simple_lap_z_facto3 ................................... Passed 127.80 sec Start 2317: mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_pqrcpend Test #1336: shm_example_simple_lap_s_facto2_sched4_kway_svdend ...................... Passed 123.83 sec Start 2318: mpi_dst_example_simple_lap_c_facto3_sched0_not_rqrcpbegin Test #1480: shm_example_simple_lap_c_facto0_sched4_not_tqrcpend ..................... Passed 95.98 sec Start 2319: mpi_dst_example_simple_lap_c_facto3_sched0_not_rqrcpend Test #1819: c_mpi_rep_example_simple_solve_and_refine_lap_c_facto0 .................. Passed 131.86 sec Start 2320: mpi_dst_example_simple_lap_c_facto3_sched0_kway_rqrcpbegin Test #1889: c_mpi_rep_example_simple_refine_cg ...................................... Passed 74.18 sec Start 2321: mpi_dst_example_simple_lap_c_facto3_sched0_kway_rqrcpend Test #1890: c_mpi_rep_example_simple_refine_gmres ................................... 
Passed 76.28 sec Start 2322: mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_rqrcpbegin Test #1450: shm_example_simple_lap_d_facto2_sched4_kway_tqrcpend .................... Passed 86.81 sec Start 2323: mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_rqrcpend Test #1423: shm_example_simple_lap_d_facto1_sched4_kway_rqrrtbegin .................. Passed 119.59 sec Start 2324: mpi_dst_example_simple_lap_c_facto3_sched0_not_tqrcpbegin Test #1797: c_mpi_rep_example_simple_lap_s_facto0 ................................... Passed 149.28 sec Start 2325: mpi_dst_example_simple_lap_c_facto3_sched0_not_tqrcpend Test #1464: shm_example_simple_lap_c_facto0_sched4_kway_svdend ...................... Passed 121.02 sec Start 2326: mpi_dst_example_simple_lap_c_facto3_sched0_kway_tqrcpbegin Test #1902: c_mpi_rep_example_refinement_lap_c_refine_gmres_sym ..................... Passed 87.68 sec Start 2327: mpi_dst_example_simple_lap_c_facto3_sched0_kway_tqrcpend Test #1893: c_mpi_rep_example_refinement_lap_s_refine_gmres_sym ..................... Passed 92.67 sec Start 2328: mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_tqrcpbegin Test #1527: shm_example_simple_lap_c_facto2_sched4_kway_svdbegin .................... Passed 99.78 sec Start 2329: mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_tqrcpend Test #1430: shm_example_simple_lap_d_facto2_sched4_not_svdend ....................... Passed 128.78 sec Start 2330: mpi_dst_example_simple_lap_c_facto3_sched0_not_rqrrtbegin Test #1898: c_mpi_rep_example_refinement_lap_c_refine_cg_her ........................ Passed 93.64 sec Start 2331: mpi_dst_example_simple_lap_c_facto3_sched0_not_rqrrtend Test #1897: c_mpi_rep_example_refinement_lap_d_refine_bicgstab_sym .................. Passed 95.40 sec Start 2332: mpi_dst_example_simple_lap_c_facto3_sched0_kway_rqrrtbegin Test #1899: c_mpi_rep_example_refinement_lap_c_refine_gmres_her ..................... 
Passed 95.33 sec Start 2333: mpi_dst_example_simple_lap_c_facto3_sched0_kway_rqrrtend Test #1924: mpi_rep_example_simple_lap_s_facto2_sched0_1d ........................... Passed 91.56 sec Start 2334: mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_rqrrtbegin Test #1867: c_mpi_rep_example_personal_lap_c_facto0 ................................. Passed 103.92 sec Start 2335: mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_rqrrtend Test #1417: shm_example_simple_lap_d_facto1_sched4_kway_tqrcpbegin .................. Passed 141.72 sec Start 2336: mpi_dst_example_simple_lap_c_facto3_sched0_kway_pqrcpilu0 Test #1469: shm_example_simple_lap_c_facto0_sched4_kway_pqrcpbegin .................. Passed 142.61 sec Start 2337: mpi_dst_example_simple_lap_c_facto3_sched0_kway_pqrcpilu1 Test #1459: shm_example_simple_lap_d_facto2_sched4_kway_pqrcpilu0 ................... Passed 148.86 sec Start 2338: mpi_dst_example_simple_lap_c_facto4_sched0_not_svdbegin Test #1974: mpi_dst_example_simple_lap_d_facto1_sched0_1d ........................... Passed 103.85 sec Start 2339: mpi_dst_example_simple_lap_c_facto4_sched0_not_svdend Test #1930: mpi_rep_example_simple_lap_c_facto2_sched0_1d ........................... Passed 109.58 sec Start 2340: mpi_dst_example_simple_lap_c_facto4_sched0_kway_svdbegin Test #1447: shm_example_simple_lap_d_facto2_sched4_not_tqrcpbegin ................... Passed 152.41 sec Start 2341: mpi_dst_example_simple_lap_c_facto4_sched0_kway_svdend Test #1385: shm_example_simple_lap_d_facto0_sched4_kway_tqrcpbegin .................. Passed 156.61 sec Start 2342: mpi_dst_example_simple_lap_c_facto4_sched0_kwayprojections_svdbegin Test #1948: mpi_rep_example_simple_lap_c_facto4_sched1_1d ........................... Passed 109.83 sec Start 2343: mpi_dst_example_simple_lap_c_facto4_sched0_kwayprojections_svdend Test #1959: mpi_rep_example_simple_lap_d_facto2_sched4_1d ........................... 
Passed 109.47 sec Start 2344: mpi_dst_example_simple_lap_c_facto4_sched0_not_pqrcpbegin Test #1363: shm_example_simple_lap_s_facto2_sched4_kway_pqrcpilu0 ................... Passed 170.97 sec Start 2345: mpi_dst_example_simple_lap_c_facto4_sched0_not_pqrcpend Test #1975: mpi_dst_example_simple_lap_d_facto2_sched0_1d ........................... Passed 125.10 sec Start 2346: mpi_dst_example_simple_lap_c_facto4_sched0_kway_pqrcpbegin Test #1313: shm_example_simple_lap_s_facto1_sched4_not_rqrcpbegin ...................***Timeout 197.80 sec Start 2347: mpi_dst_example_simple_lap_c_facto4_sched0_kway_pqrcpend Test #1943: mpi_rep_example_simple_lap_d_facto2_sched1_1d ........................... Passed 130.11 sec Start 2348: mpi_dst_example_simple_lap_c_facto4_sched0_kwayprojections_pqrcpbegin Test #1895: c_mpi_rep_example_refinement_lap_d_refine_cg_sym ........................ Passed 138.86 sec Start 2349: mpi_dst_example_simple_lap_c_facto4_sched0_kwayprojections_pqrcpend Test #1444: shm_example_simple_lap_d_facto2_sched4_kway_rqrcpend ....................***Timeout 238.74 sec Test #1460: shm_example_simple_lap_d_facto2_sched4_kway_pqrcpilu1 ...................***Timeout 237.87 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance 
criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Test #1477: shm_example_simple_lap_c_facto0_sched4_kwayprojections_rqrcpbegin .......***Timeout 232.67 sec Test #1888: c_mpi_rep_example_step-by-step_single_mm2 ...............................***Timeout 205.09 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: IJV N: 1280 nnz: 12029 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.586301e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 10749 Fill-in of L 0.893590 Time to compute symbol matrix 6.862401e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.296985e-02 s +-------------------------------------------------+ 
Mapping/Scheduling subtask: Number of non-zeroes in blocked L 21498 Fill-in 1.787181 Number of operations in full-rank: LU 1.08 MFlops Prediction: Model AMD 6180 MKL Time to factorize 8.513996e-04 s Time for mapping/scheduling 1.680736e-01 s Time to initialize internal csc 1.739767e-02 s Time to initialize coeftab 1.898488e-02 s Time to factorize 2.378323e-01 s ( 4.56 MFlop/s) Number of operations 4.80 MFlops Number of static pivots 0 Memory usage of coeftab 427 Ko Time to solve 5.656615e-01 s WARNING: WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Refinement works only with 1 rhs, We will iterate on each RHS one by one Time for refinement 7.847276e+00 s || A ||_1 7.256473e-01 max(|| b_i ||_oo) 3.287903e-01 max(|| x_i ||_oo) 7.029080e-01 || A ||_1 7.256473e-01 max(|| b_i ||_oo) 3.287903e-01 max(|| x_i ||_oo) 7.029080e-01 || A ||_1 7.256473e-01 max(|| b_i ||_oo) 3.287903e-01 max(|| x_i ||_oo) 7.029080e-01 || A ||_1 7.256473e-01 max(|| b_i ||_oo) 3.287903e-01 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.081183e-16 max(|| b_i - A x_i ||_1) 1.159050e-18 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.415762e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.081183e-16 max(|| b_i - A x_i ||_1) 1.159050e-18 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.415762e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.081183e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.081183e-16 max(|| b_i - A x_i ||_1) 1.159050e-18 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.415762e-04 (SUCCESS) max(|| b_i - A x_i ||_1) 1.159050e-18 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.415762e-04 (SUCCESS) Time to solve 2.664463e-01 s WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: 
Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Time for refinement 3.319651e+00 s || A ||_1 7.256473e-01 max(|| b_i ||_oo) 3.287903e-01 max(|| x_i ||_oo) 7.029080e-01 || A ||_1 7.256473e-01 max(|| b_i ||_oo) 3.287903e-01 max(|| x_i ||_oo) 7.029080e-01 || A ||_1 7.256473e-01 max(|| b_i ||_oo) 3.287903e-01 max(|| x_i ||_oo) 7.029080e-01 || A ||_1 7.256473e-01 max(|| b_i ||_oo) 3.287903e-01 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.081183e-16 max(|| b_i - A x_i ||_1) 1.159050e-18 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.415762e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.081183e-16 max(|| b_i - A x_i ||_1) 1.159050e-18 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.415762e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.081183e-16 max(|| b_i - A x_i ||_1) 1.159050e-18 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.415762e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.081183e-16 max(|| b_i - A x_i ||_1) 1.159050e-18 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.415762e-04 (SUCCESS) Time to initialize internal csc 4.476576e-02 s Time to initialize coeftab 7.094129e-02 s Time to factorize 6.761327e-01 s ( 1.60 MFlop/s) Number of operations 4.80 MFlops Number of static pivots 0 Memory usage of coeftab 427 Ko Time to solve 4.924756e+00 s WARNING: WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Refinement works only with 1 rhs, We will iterate on each RHS one by one Time for refinement 9.549302e+00 s || A ||_1 7.256473e-01 max(|| b_i ||_oo) 3.287903e-01 max(|| x_i ||_oo) 7.029080e-01 || A 
||_1 7.256473e-01 max(|| b_i ||_oo) 3.287903e-01 max(|| x_i ||_oo) 7.029080e-01 || A ||_1 7.256473e-01 max(|| b_i ||_oo) 3.287903e-01 max(|| x_i ||_oo) 7.029080e-01 || A ||_1 7.256473e-01 max(|| b_i ||_oo) 3.287903e-01 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.081183e-16 max(|| b_i - A x_i ||_1) 1.159050e-18 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.415762e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.081183e-16 max(|| b_i - A x_i ||_1) 1.159050e-18 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.415762e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.081183e-16 max(|| b_i - A x_i ||_1) 1.159050e-18 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.415762e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.081183e-16 max(|| b_i - A x_i ||_1) 1.159050e-18 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.415762e-04 (SUCCESS) Time to solve 1.323844e+00 s WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Start 1888: c_mpi_rep_example_step-by-step_single_mm2 Test #1903: c_mpi_rep_example_refinement_lap_c_refine_bicgstab_sym ..................***Timeout 203.21 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models 
CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.006946e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.210053e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.748643e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 5.118428e-01 s Time to initialize internal csc 2.481648e-03 s - iteration 1 : total iteration time 2.17 s error 0.067713 - iteration 2 : total iteration time 1.57 s error 0.010548 - iteration 3 : total iteration time 8.33 s error 0.002058 - iteration 4 : total iteration time 1.74 s error 0.00043473 - iteration 5 : total iteration time 4.76 s error 9.1546e-05 - iteration 6 : total iteration time 2.41 s error 1.8273e-05 - iteration 7 : total iteration time 2.87 s error 3.4435e-06 - iteration 8 : total iteration time 1.07 s error 5.9495e-07 Time for refinement 2.566156e+01 s Start 1903: c_mpi_rep_example_refinement_lap_c_refine_bicgstab_sym Test #1907: c_mpi_rep_example_refinement_lap_z_refine_cg_sym ........................***Timeout 202.38 sec ischedInit: The thread number has been automatically set to 256 
ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.409492e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.239479e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.061532e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.591753e+00 s Time to initialize internal csc 3.226697e-03 s - iteration 1 : total iteration time 0.65 s error 0.20457 - iteration 2 : total iteration time 1.69 s error 0.058883 - iteration 3 : total iteration time 0.627 s error 0.018804 - iteration 4 : total iteration time 0.62 s error 0.0064705 - iteration 5 : total iteration 
time 1.76 s error 0.0022688 - iteration 6 : total iteration time 0.735 s error 0.00080218 Start 1907: c_mpi_rep_example_refinement_lap_z_refine_cg_sym Test #1912: c_mpi_rep_example_simple_mixed_refine_bicgstab ..........................***Timeout 202.15 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: General Arithmetic: Double Format: CSC N: 1030 nnz: 6858 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.115385e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 51109 Fill-in of L 7.452464 Time to compute symbol matrix 5.389241e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.436364e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 102218 Fill-in 14.904929 Number of operations in full-rank: LU 5.50 MFlops Prediction: Model AMD 6180 MKL Time to factorize 7.121319e-04 s Time for mapping/scheduling 5.783852e-01 s 
+-------------------------------------------------+ Analyze task: Total time for analyze 1.190614e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.063639e-02 s Time to initialize coeftab 1.470427e-01 s Time to factorize 5.832934e-01 s ( 9.43 MFlop/s) Number of operations 9.09 MFlops Number of static pivots 0 Memory usage of coeftab 150 Ko Time to solve 1.139439e+00 s Start 1912: c_mpi_rep_example_simple_mixed_refine_bicgstab Test #1919: c_mpi_rep_example_simple_mixed_lap_z_refine_cg_sym ......................***Timeout 200.61 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.069908e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.278368e-02 s +-------------------------------------------------+ Reordering 
subtask: Split level 0 Stoping criterion -1 Time for reordering 8.344501e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.507932e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.335787e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 8.073069e-03 s Time to initialize coeftab 9.352848e-01 s Time to factorize 1.125091e+00 s (35.53 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Memory usage of coeftab 274 Ko Time to solve 2.248415e+00 s Start 1919: c_mpi_rep_example_simple_mixed_lap_z_refine_cg_sym Test #1541: shm_example_simple_lap_c_facto2_sched4_kwayprojections_rqrcpbegin .......***Timeout 196.52 sec Test #1935: mpi_rep_example_simple_lap_z_facto2_sched0_1d ...........................***Timeout 196.47 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Start 1935: 
mpi_rep_example_simple_lap_z_facto2_sched0_1d Test #1939: mpi_rep_example_simple_lap_s_facto1_sched1_1d ...........................***Timeout 196.39 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.572457e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.683509e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.127433e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.340063e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.014986e+01 s 
+-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.219658e-02 s Time to initialize coeftab 7.401759e-02 s Time to factorize 5.865288e-01 s ( 8.92 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Memory usage of coeftab 68.5 Ko Time to solve 1.855019e+00 s Time for refinement 2.216363e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.756020e-07 max(|| b_i - A x_i ||_1) 7.765723e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.758134e-01 (SUCCESS) || A ||_1 5.112398e-02 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.756020e-07 max(|| b_i - A x_i ||_1) 7.765723e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.758134e-01 (SUCCESS) max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.756020e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.756020e-07 max(|| b_i - A x_i ||_1) 7.765723e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.758134e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 7.765723e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.758134e-01 (SUCCESS) Start 1939: mpi_rep_example_simple_lap_s_facto1_sched1_1d Test #1543: shm_example_simple_lap_c_facto2_sched4_not_tqrcpbegin ...................***Timeout 196.12 sec Test #1944: mpi_rep_example_simple_lap_c_facto0_sched1_1d ...........................***Timeout 195.87 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: 
Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 1944: mpi_rep_example_simple_lap_c_facto0_sched1_1d Test #1952: mpi_rep_example_simple_lap_z_facto3_sched1_1d ...........................***Timeout 195.39 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.112728e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.071976e-02 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.768258e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.735354e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.507575e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 3.955119e-02 s Time to initialize coeftab 1.793409e-01 s Time to factorize 7.564635e-01 s (26.81 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Memory usage of coeftab 274 Ko Time to solve 1.382341e+00 s Time for refinement 9.743231e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.254925e-16 max(|| b_i - A x_i ||_1) 1.825499e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.606352e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.254925e-16 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_1) 1.825499e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.606352e-03 (SUCCESS) || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.254925e-16 max(|| b_i - A x_i ||_1) 1.825499e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.606352e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.254925e-16 max(|| b_i - A x_i ||_1) 1.825499e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.606352e-03 (SUCCESS) Start 1952: 
mpi_rep_example_simple_lap_z_facto3_sched1_1d Test #1956: mpi_rep_example_simple_lap_s_facto2_sched4_1d ...........................***Timeout 195.12 sec ischedInit: The thread number has been automatically set to 256 Start 1956: mpi_rep_example_simple_lap_s_facto2_sched4_1d Test #1957: mpi_rep_example_simple_lap_d_facto0_sched4_1d ...........................***Timeout 195.11 sec Start 1957: mpi_rep_example_simple_lap_d_facto0_sched4_1d Test #1961: mpi_rep_example_simple_lap_c_facto1_sched4_1d ...........................***Timeout 194.89 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.255471e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.606172e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.588481e-01 s 
+-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.747130e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.323689e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.057968e-02 s Time to initialize coeftab 2.897643e-02 s Time to factorize 1.035566e+00 s (20.58 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Memory usage of coeftab 137 Ko Time to solve 2.881916e+00 s Time for refinement 1.090328e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.821951e-07 max(|| b_i - A x_i ||_1) 8.039148e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.028518e+00 (SUCCESS) || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.821951e-07 max(|| b_i - A x_i ||_1) 8.039148e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.028518e+00 (SUCCESS) || A ||_1 5.112398e-02 || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.821951e-07 max(|| b_i - A x_i ||_1) 8.039148e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.028518e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.821951e-07 max(|| b_i - A x_i ||_1) 8.039148e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.028518e+00 (SUCCESS) Start 1961: mpi_rep_example_simple_lap_c_facto1_sched4_1d Test #1964: mpi_rep_example_simple_lap_c_facto4_sched4_1d ...........................***Timeout 193.96 sec Start 1964: 
mpi_rep_example_simple_lap_c_facto4_sched4_1d Test #1553: shm_example_simple_lap_c_facto2_sched4_kwayprojections_rqrrtbegin .......***Timeout 193.83 sec Test #1971: mpi_dst_example_simple_lap_s_facto1_sched0_1d ...........................***Timeout 192.03 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.219771e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.245585e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.415184e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Start 1971: mpi_dst_example_simple_lap_s_facto1_sched0_1d Test #1976: mpi_dst_example_simple_lap_c_facto0_sched0_1d ...........................***Timeout 
191.77 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 1976: mpi_dst_example_simple_lap_c_facto0_sched0_1d Test #1979: mpi_dst_example_simple_lap_c_facto3_sched0_1d ...........................***Timeout 191.69 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 
Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 1979: mpi_dst_example_simple_lap_c_facto3_sched0_1d Test #1980: mpi_dst_example_simple_lap_c_facto4_sched0_1d ...........................***Timeout 191.70 sec Start 1980: mpi_dst_example_simple_lap_c_facto4_sched0_1d Test #1566: shm_example_simple_lap_c_facto3_sched4_kway_pqrcpend ....................***Timeout 191.70 sec Test #1981: mpi_dst_example_simple_lap_z_facto0_sched0_1d ...........................***Timeout 191.69 sec Start 1981: mpi_dst_example_simple_lap_z_facto0_sched0_1d Test #1982: mpi_dst_example_simple_lap_z_facto1_sched0_1d ...........................***Timeout 191.68 sec Start 1982: mpi_dst_example_simple_lap_z_facto1_sched0_1d Test #1571: shm_example_simple_lap_c_facto3_sched4_kway_rqrcpbegin ..................***Timeout 191.68 sec Test #1983: mpi_dst_example_simple_lap_z_facto2_sched0_1d ...........................***Timeout 191.67 sec Start 1983: mpi_dst_example_simple_lap_z_facto2_sched0_1d Test #1986: mpi_dst_example_simple_lap_s_facto0_sched1_1d ...........................***Timeout 191.61 sec Start 1986: mpi_dst_example_simple_lap_s_facto0_sched1_1d Test #1991: mpi_dst_example_simple_lap_d_facto2_sched1_1d ...........................***Timeout 191.35 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI 
processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Start 1991: mpi_dst_example_simple_lap_d_facto2_sched1_1d Test #1992: mpi_dst_example_simple_lap_c_facto0_sched1_1d ...........................***Timeout 191.39 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.260351e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.179048e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 
Stoping criterion -1 Time for reordering 1.028174e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.437952e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.387606e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 8.099890e-01 s Time to initialize coeftab 2.594016e-01 s Time to factorize 1.574205e+00 s (12.88 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Memory usage of coeftab 137 Ko Start 1992: mpi_dst_example_simple_lap_c_facto0_sched1_1d Test #1993: mpi_dst_example_simple_lap_c_facto1_sched1_1d ...........................***Timeout 191.48 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ 
Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.922577e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.093622e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.301270e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.243329e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.163538e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.318838e-01 s Time to initialize coeftab 3.903669e-02 s Time to factorize 1.263047e+00 s (16.87 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Memory usage of coeftab 137 Ko Time to solve 3.358082e+00 s Time for refinement 2.251679e+00 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.799538e-07 max(|| b_i - A x_i ||_1) 7.850311e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.980907e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.799538e-07 max(|| b_i - A x_i ||_1) 7.850311e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.980907e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.799538e-07 max(|| b_i - A 
x_i ||_1) 7.850311e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.980907e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.799538e-07 max(|| b_i - A x_i ||_1) 7.850311e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.980907e+00 (SUCCESS) Start 1993: mpi_dst_example_simple_lap_c_facto1_sched1_1d Test #2003: mpi_dst_example_simple_lap_s_facto1_sched4_1d ...........................***Timeout 190.60 sec Start 2003: mpi_dst_example_simple_lap_s_facto1_sched4_1d Test #1591: shm_example_simple_lap_c_facto4_sched4_kway_svdbegin ....................***Timeout 190.58 sec Test #2004: mpi_dst_example_simple_lap_s_facto2_sched4_1d ...........................***Timeout 190.57 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.186036e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 
65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.293021e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.953831e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 3.166217e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.659623e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 7.714663e-01 s Time to initialize coeftab 1.178473e+00 s Time to factorize 9.912895e-01 s (10.07 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Memory usage of coeftab 137 Ko Start 2004: mpi_dst_example_simple_lap_s_facto2_sched4_1d 1847/3626 Test #2104: mpi_dst_example_simple_lap_s_facto2_sched0_kwayprojections_tqrcpbegin ...***Timeout 190.60 sec Start 2104: mpi_dst_example_simple_lap_s_facto2_sched0_kwayprojections_tqrcpbegin 1847/3626 Test #2105: mpi_dst_example_simple_lap_s_facto2_sched0_kwayprojections_tqrcpend .....***Timeout 190.63 sec Start 2105: mpi_dst_example_simple_lap_s_facto2_sched0_kwayprojections_tqrcpend 1847/3626 Test #2106: mpi_dst_example_simple_lap_s_facto2_sched0_not_rqrrtbegin ...............***Timeout 190.64 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution 
level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 Start 2106: mpi_dst_example_simple_lap_s_facto2_sched0_not_rqrrtbegin 1847/3626 Test #2107: mpi_dst_example_simple_lap_s_facto2_sched0_not_rqrrtend .................***Timeout 190.65 sec Start 2107: mpi_dst_example_simple_lap_s_facto2_sched0_not_rqrrtend 1847/3626 Test #2108: mpi_dst_example_simple_lap_s_facto2_sched0_kway_rqrrtbegin ..............***Timeout 190.67 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Start 2108: mpi_dst_example_simple_lap_s_facto2_sched0_kway_rqrrtbegin 
1847/3626 Test #2109: mpi_dst_example_simple_lap_s_facto2_sched0_kway_rqrrtend ................***Timeout 190.70 sec Start 2109: mpi_dst_example_simple_lap_s_facto2_sched0_kway_rqrrtend 1847/3626 Test #2110: mpi_dst_example_simple_lap_s_facto2_sched0_kwayprojections_rqrrtbegin ...***Timeout 190.72 sec Start 2110: mpi_dst_example_simple_lap_s_facto2_sched0_kwayprojections_rqrrtbegin 1847/3626 Test #2111: mpi_dst_example_simple_lap_s_facto2_sched0_kwayprojections_rqrrtend .....***Timeout 190.73 sec Start 2111: mpi_dst_example_simple_lap_s_facto2_sched0_kwayprojections_rqrrtend 1847/3626 Test #2112: mpi_dst_example_simple_lap_s_facto2_sched0_kway_pqrcpilu0 ...............***Timeout 190.74 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 
200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.110555e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.734315e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.418471e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.260169e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.385894e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 5.087000e-01 s Time to initialize coeftab 7.965306e-01 s Time to factorize 4.563097e+00 s ( 2.19 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Start 2112: mpi_dst_example_simple_lap_s_facto2_sched0_kway_pqrcpilu0 1847/3626 Test #2113: mpi_dst_example_simple_lap_s_facto2_sched0_kway_pqrcpilu1 ...............***Timeout 190.76 sec Start 2113: mpi_dst_example_simple_lap_s_facto2_sched0_kway_pqrcpilu1 1847/3626 Test #2114: mpi_dst_example_simple_lap_d_facto0_sched0_not_svdbegin .................***Timeout 190.77 sec Start 2114: mpi_dst_example_simple_lap_d_facto0_sched0_not_svdbegin 1847/3626 Test #2115: 
mpi_dst_example_simple_lap_d_facto0_sched0_not_svdend ...................***Timeout 190.79 sec Start 2115: mpi_dst_example_simple_lap_d_facto0_sched0_not_svdend 1847/3626 Test #2116: mpi_dst_example_simple_lap_d_facto0_sched0_kway_svdbegin ................***Timeout 190.81 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2116: mpi_dst_example_simple_lap_d_facto0_sched0_kway_svdbegin 1847/3626 Test #2117: mpi_dst_example_simple_lap_d_facto0_sched0_kway_svdend ..................***Timeout 190.82 sec Start 2117: mpi_dst_example_simple_lap_d_facto0_sched0_kway_svdend 1847/3626 Test #2118: mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_svdbegin .....***Timeout 190.86 sec Start 2118: mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_svdbegin 1847/3626 Test #2119: mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_svdend .......***Timeout 190.87 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections 
distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.795117e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.753469e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.117896e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.882733e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.330843e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.064166e+00 s Time to initialize coeftab 9.553563e-01 s Start 2119: mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_svdend 1847/3626 Test #2120: mpi_dst_example_simple_lap_d_facto0_sched0_not_pqrcpbegin ...............***Timeout 190.96 sec Start 2120: mpi_dst_example_simple_lap_d_facto0_sched0_not_pqrcpbegin 1847/3626 Test #2121: mpi_dst_example_simple_lap_d_facto0_sched0_not_pqrcpend .................***Timeout 191.00 sec Start 2121: mpi_dst_example_simple_lap_d_facto0_sched0_not_pqrcpend 1847/3626 Test #2122: mpi_dst_example_simple_lap_d_facto0_sched0_kway_pqrcpbegin ..............***Timeout 191.04 sec Start 2122: 
mpi_dst_example_simple_lap_d_facto0_sched0_kway_pqrcpbegin 1847/3626 Test #2123: mpi_dst_example_simple_lap_d_facto0_sched0_kway_pqrcpend ................***Timeout 191.05 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.060678e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Start 2123: mpi_dst_example_simple_lap_d_facto0_sched0_kway_pqrcpend 1847/3626 Test #2124: mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_pqrcpbegin ...***Timeout 191.09 sec Start 2124: 
mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_pqrcpbegin 1847/3626 Test #2125: mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_pqrcpend .....***Timeout 191.10 sec Start 2125: mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_pqrcpend 1847/3626 Test #2126: mpi_dst_example_simple_lap_d_facto0_sched0_not_rqrcpbegin ...............***Timeout 191.10 sec Start 2126: mpi_dst_example_simple_lap_d_facto0_sched0_not_rqrcpbegin 1847/3626 Test #2127: mpi_dst_example_simple_lap_d_facto0_sched0_not_rqrcpend .................***Timeout 191.12 sec Start 2127: mpi_dst_example_simple_lap_d_facto0_sched0_not_rqrcpend 1847/3626 Test #2128: mpi_dst_example_simple_lap_d_facto0_sched0_kway_rqrcpbegin ..............***Timeout 191.13 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 
1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.709654e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.714480e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.971089e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.988740e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.307548e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.677376e-01 s Time to initialize coeftab 9.471907e-01 s Start 2128: mpi_dst_example_simple_lap_d_facto0_sched0_kway_rqrcpbegin 1847/3626 Test #2129: mpi_dst_example_simple_lap_d_facto0_sched0_kway_rqrcpend ................***Timeout 191.12 sec Start 2129: mpi_dst_example_simple_lap_d_facto0_sched0_kway_rqrcpend 1847/3626 Test #2130: mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_rqrcpbegin ...***Timeout 191.15 sec Start 2130: mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_rqrcpbegin 1847/3626 Test #2131: mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_rqrcpend .....***Timeout 191.16 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: 
Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2131: mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_rqrcpend 1847/3626 Test #2132: mpi_dst_example_simple_lap_d_facto0_sched0_not_tqrcpbegin ...............***Timeout 191.17 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI 
communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.946786e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.801108e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.664903e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 5.539719e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.747878e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.644619e-01 s Time to initialize coeftab 1.232136e+00 s Start 2132: mpi_dst_example_simple_lap_d_facto0_sched0_not_tqrcpbegin 1847/3626 Test #2133: mpi_dst_example_simple_lap_d_facto0_sched0_not_tqrcpend .................***Timeout 191.18 sec 
ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 Start 2133: mpi_dst_example_simple_lap_d_facto0_sched0_not_tqrcpend 1847/3626 Test #2134: mpi_dst_example_simple_lap_d_facto0_sched0_kway_tqrcpbegin ..............***Timeout 191.19 sec ischedInit: The thread number has been automatically set to 256 Start 2134: mpi_dst_example_simple_lap_d_facto0_sched0_kway_tqrcpbegin Start 2350: mpi_dst_example_simple_lap_c_facto4_sched0_not_rqrcpbegin Start 2351: mpi_dst_example_simple_lap_c_facto4_sched0_not_rqrcpend Start 2352: mpi_dst_example_simple_lap_c_facto4_sched0_kway_rqrcpbegin Start 2353: mpi_dst_example_simple_lap_c_facto4_sched0_kway_rqrcpend Start 2354: 
mpi_dst_example_simple_lap_c_facto4_sched0_kwayprojections_rqrcpbegin Start 2355: mpi_dst_example_simple_lap_c_facto4_sched0_kwayprojections_rqrcpend Start 2356: mpi_dst_example_simple_lap_c_facto4_sched0_not_tqrcpbegin Start 2357: mpi_dst_example_simple_lap_c_facto4_sched0_not_tqrcpend Start 2358: mpi_dst_example_simple_lap_c_facto4_sched0_kway_tqrcpbegin Test #1870: c_mpi_rep_example_personal_lap_c_facto3 ................................. Passed 190.65 sec Test #1936: mpi_rep_example_simple_lap_z_facto3_sched0_1d ........................... Passed 190.16 sec Test #1858: c_mpi_rep_example_step-by-step_lap_z_facto2 ............................. Passed 190.86 sec Test #1850: c_mpi_rep_example_step-by-step_lap_d_facto2 ............................. Passed 191.71 sec Test #1921: c_mpi_rep_example_simple_mixed_lap_z_refine_bicgstab_sym ................ Passed 190.37 sec 1852/3626 Test #2170: mpi_dst_example_simple_lap_d_facto1_sched0_not_rqrrtbegin ............... Passed 185.67 sec 1853/3626 Test #2181: mpi_dst_example_simple_lap_d_facto2_sched0_kway_svdend .................. Passed 184.73 sec Test #1853: c_mpi_rep_example_step-by-step_lap_c_facto2 ............................. Passed 191.78 sec Test #1911: c_mpi_rep_example_simple_mixed_refine_gmres ............................. Passed 190.52 sec Test #1900: c_mpi_rep_example_refinement_lap_c_refine_bicgstab_her .................. Passed 190.63 sec Test #1942: mpi_rep_example_simple_lap_d_facto1_sched1_1d ........................... Passed 190.19 sec Test #1875: c_mpi_rep_example_personal_lap_z_facto3 ................................. Passed 190.81 sec Test #1949: mpi_rep_example_simple_lap_z_facto0_sched1_1d ........................... Passed 190.03 sec Test #1901: c_mpi_rep_example_refinement_lap_c_refine_cg_sym ........................ Passed 190.74 sec Test #1938: mpi_rep_example_simple_lap_s_facto0_sched1_1d ........................... 
Passed 190.33 sec Test #1965: mpi_rep_example_simple_lap_z_facto0_sched4_1d ........................... Passed 189.89 sec Test #1927: mpi_rep_example_simple_lap_d_facto2_sched0_1d ........................... Passed 190.52 sec Test #1915: c_mpi_rep_example_simple_mixed_lap_d_refine_bicgstab_sym ................ Passed 190.67 sec 1865/3626 Test #2237: mpi_dst_example_simple_lap_c_facto0_sched0_kway_rqrrtend ................ Passed 183.86 sec Test #2012: mpi_dst_example_simple_lap_c_facto4_sched4_1d ........................... Passed 189.08 sec Test #1947: mpi_rep_example_simple_lap_c_facto3_sched1_1d ........................... Passed 190.31 sec Test #2024: mpi_dst_example_simple_lap_s_facto0_sched0_not_pqrcpbegin ............... Passed 188.99 sec Test #1473: shm_example_simple_lap_c_facto0_sched4_not_rqrcpbegin ................... Passed 191.56 sec Test #2000: mpi_dst_example_simple_lap_z_facto3_sched1_1d ........................... Passed 189.73 sec Test #1873: c_mpi_rep_example_personal_lap_z_facto1 ................................. 
Passed 191.12 sec Start 2359: mpi_dst_example_simple_lap_c_facto4_sched0_kway_tqrcpend Start 2360: mpi_dst_example_simple_lap_c_facto4_sched0_kwayprojections_tqrcpbegin Start 2361: mpi_dst_example_simple_lap_c_facto4_sched0_kwayprojections_tqrcpend Start 2362: mpi_dst_example_simple_lap_c_facto4_sched0_not_rqrrtbegin Start 2363: mpi_dst_example_simple_lap_c_facto4_sched0_not_rqrrtend Start 2364: mpi_dst_example_simple_lap_c_facto4_sched0_kway_rqrrtbegin Start 2365: mpi_dst_example_simple_lap_c_facto4_sched0_kway_rqrrtend Start 2366: mpi_dst_example_simple_lap_c_facto4_sched0_kwayprojections_rqrrtbegin Start 2367: mpi_dst_example_simple_lap_c_facto4_sched0_kwayprojections_rqrrtend Start 2368: mpi_dst_example_simple_lap_c_facto4_sched0_kway_pqrcpilu0 Start 2369: mpi_dst_example_simple_lap_c_facto4_sched0_kway_pqrcpilu1 Start 2370: mpi_dst_example_simple_lap_z_facto0_sched0_not_svdbegin Start 2371: mpi_dst_example_simple_lap_z_facto0_sched0_not_svdend Start 2372: mpi_dst_example_simple_lap_z_facto0_sched0_kway_svdbegin Start 2373: mpi_dst_example_simple_lap_z_facto0_sched0_kway_svdend Start 2374: mpi_dst_example_simple_lap_z_facto0_sched0_kwayprojections_svdbegin Start 2375: mpi_dst_example_simple_lap_z_facto0_sched0_kwayprojections_svdend Start 2376: mpi_dst_example_simple_lap_z_facto0_sched0_not_pqrcpbegin Start 2377: mpi_dst_example_simple_lap_z_facto0_sched0_not_pqrcpend Start 2378: mpi_dst_example_simple_lap_z_facto0_sched0_kway_pqrcpbegin Start 2379: mpi_dst_example_simple_lap_z_facto0_sched0_kway_pqrcpend Start 2380: mpi_dst_example_simple_lap_z_facto0_sched0_kwayprojections_pqrcpbegin Start 2381: mpi_dst_example_simple_lap_z_facto0_sched0_kwayprojections_pqrcpend Start 2382: mpi_dst_example_simple_lap_z_facto0_sched0_not_rqrcpbegin Start 2383: mpi_dst_example_simple_lap_z_facto0_sched0_not_rqrcpend 1872/3626 Test #2154: mpi_dst_example_simple_lap_d_facto1_sched0_kway_pqrcpbegin .............. 
Passed 187.40 sec Start 2384: mpi_dst_example_simple_lap_z_facto0_sched0_kway_rqrcpbegin Test #1589: shm_example_simple_lap_c_facto4_sched4_not_svdbegin ..................... Passed 190.82 sec Start 2385: mpi_dst_example_simple_lap_z_facto0_sched0_kway_rqrcpend Test #1656: shm_example_simple_lap_z_facto1_sched4_kway_svdend ...................... Passed 191.26 sec Start 2386: mpi_dst_example_simple_lap_z_facto0_sched0_kwayprojections_rqrcpbegin Test #1545: shm_example_simple_lap_c_facto2_sched4_kway_tqrcpbegin .................. Passed 196.61 sec Start 2387: mpi_dst_example_simple_lap_z_facto0_sched0_kwayprojections_rqrcpend Test #1845: c_mpi_rep_example_step-by-step_lap_s_facto0 .............................***Timeout 388.71 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 1845: c_mpi_rep_example_step-by-step_lap_s_facto0 Test #1846: c_mpi_rep_example_step-by-step_lap_s_facto1 .............................***Timeout 388.70 sec ischedInit: The thread number has been automatically set to 256 Start 1846: c_mpi_rep_example_step-by-step_lap_s_facto1 Test #1847: c_mpi_rep_example_step-by-step_lap_s_facto2 .............................***Timeout 388.69 sec ischedInit: The thread number has been automatically set to 256 Start 
1847: c_mpi_rep_example_step-by-step_lap_s_facto2 Test #1849: c_mpi_rep_example_step-by-step_lap_d_facto1 .............................***Timeout 388.68 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.673990e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.891895e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.205717e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.150984e+00 s Time to initialize internal csc 2.290431e-03 s Time to initialize coeftab 2.430726e-01 s Time to 
factorize 5.583319e-01 s ( 9.37 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Memory usage of coeftab 137 Ko Time to solve 9.373392e-01 s WARNING: WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Time for refinement 5.470102e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.176416e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.212951e-16 max(|| b_i - A x_i ||_1) 1.677314e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.189645e-03 (SUCCESS) max(|| x_i ||_oo) 4.996108e-01 max(|| x0_i - x_i ||_oo) 1.110223e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.223106e-03 (SUCCESS) || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.176416e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.176416e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.212951e-16 max(|| b_i - A x_i ||_1) 1.677314e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.189645e-03 (SUCCESS) max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.212951e-16 max(|| b_i - A x_i ||_1) 1.677314e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.189645e-03 (SUCCESS) max(|| x0_i - x_i ||_oo) 1.110223e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.223106e-03 (SUCCESS) max(|| x_i ||_oo) 4.996108e-01 max(|| x0_i - x_i ||_oo) 1.110223e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.223106e-03 (SUCCESS) || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.176416e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.212951e-16 max(|| b_i - A x_i ||_1) 1.677314e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.189645e-03 (SUCCESS) max(|| x_i ||_oo) 4.996108e-01 max(|| x0_i - x_i ||_oo) 
1.110223e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.223106e-03 (SUCCESS) Time to solve 9.091478e-01 s WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Time for refinement 9.549609e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.176416e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.176416e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.199453e-16 max(|| b_i - A x_i ||_1) 1.682167e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.177000e-03 (SUCCESS) max(|| x_i ||_oo) 4.996108e-01 max(|| x0_i - x_i ||_oo) 9.992007e-16 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.004526e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.199453e-16 Time to initialize internal csc 4.197860e-04 s max(|| b_i - A x_i ||_1) 1.682167e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.177000e-03 (SUCCESS) max(|| x_i ||_oo) 4.996108e-01 max(|| x0_i - x_i ||_oo) 9.992007e-16 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.004526e-03 (SUCCESS) || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.176416e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.176416e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.199453e-16 max(|| b_i - A x_i ||_1) 1.682167e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.177000e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.199453e-16 max(|| x_i ||_oo) 4.996108e-01 max(|| x0_i - x_i ||_oo) 9.992007e-16 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.004526e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 1.682167e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.177000e-03 (SUCCESS) max(|| x_i ||_oo) 4.996108e-01 max(|| x0_i - x_i ||_oo) 9.992007e-16 
max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.004526e-03 (SUCCESS) Time to initialize coeftab 1.053871e-01 s Time to factorize 2.387294e+00 s ( 2.19 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Memory usage of coeftab 137 Ko Time to solve 9.616274e-01 s WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Time for refinement 1.426300e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.176416e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.176416e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.176416e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.137963e-16 max(|| b_i - A x_i ||_1) 1.667387e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.141451e-03 (SUCCESS) max(|| x_i ||_oo) 4.996108e-01 max(|| x0_i - x_i ||_oo) 9.992007e-16 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.004526e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.137963e-16 max(|| b_i - A x_i ||_1) 1.667387e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.141451e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.137963e-16 max(|| b_i - A x_i ||_1) 1.667387e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.141451e-03 (SUCCESS) max(|| x_i ||_oo) 4.996108e-01 max(|| x0_i - x_i ||_oo) 9.992007e-16 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.004526e-03 (SUCCESS) max(|| x_i ||_oo) 4.996108e-01 max(|| x0_i - x_i ||_oo) 9.992007e-16 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.004526e-03 (SUCCESS) || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.176416e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.137963e-16 max(|| b_i - A x_i ||_1) 1.667387e-16 max(|| b_i - A x_i ||_1 / 
(||A||_1 * ||x_i||_oo * eps)) 2.141451e-03 (SUCCESS) max(|| x_i ||_oo) 4.996108e-01 max(|| x0_i - x_i ||_oo) 9.992007e-16 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.004526e-03 (SUCCESS) Time to solve 6.371034e-01 s WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Time for refinement 1.304098e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.176416e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.250067e-16 max(|| b_i - A x_i ||_1) 1.684040e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.163960e-03 (SUCCESS) max(|| x_i ||_oo) 4.996108e-01 max(|| x0_i - x_i ||_oo) 9.992007e-16 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.004526e-03 (SUCCESS) || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.176416e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.250067e-16 || A ||_1 5.112481e-02 max(|| b_i - A x_i ||_1) 1.684040e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.163960e-03 (SUCCESS) max(|| x_i ||_oo) 4.996108e-01 max(|| x0_i - x_i ||_oo) 9.992007e-16 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.004526e-03 (SUCCESS) || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.176416e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.176416e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.250067e-16 max(|| b_i - A x_i ||_1) 1.684040e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.163960e-03 (SUCCESS) max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.250067e-16 max(|| b_i - A x_i ||_1) 1.684040e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.163960e-03 (SUCCESS) max(|| x0_i - x_i ||_oo) 9.992007e-16 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.004526e-03 (SUCCESS) max(|| 
x_i ||_oo) 4.996108e-01 max(|| x0_i - x_i ||_oo) 9.992007e-16 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.004526e-03 (SUCCESS) Start 1849: c_mpi_rep_example_step-by-step_lap_d_facto1 Test #1852: c_mpi_rep_example_step-by-step_lap_c_facto1 .............................***Timeout 388.66 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Start 1852: c_mpi_rep_example_step-by-step_lap_c_facto1 Test #1854: c_mpi_rep_example_step-by-step_lap_c_facto3 .............................***Timeout 388.64 sec Start 1854: c_mpi_rep_example_step-by-step_lap_c_facto3 Test #1856: c_mpi_rep_example_step-by-step_lap_z_facto0 .............................***Timeout 388.60 sec ischedInit: The thread number has been automatically set to 256 Start 1856: c_mpi_rep_example_step-by-step_lap_z_facto0 Test #1857: c_mpi_rep_example_step-by-step_lap_z_facto1 .............................***Timeout 388.57 sec ischedInit: The thread number has been automatically set to 256 Start 1857: c_mpi_rep_example_step-by-step_lap_z_facto1 Test #1360: shm_example_simple_lap_s_facto2_sched4_kway_rqrrtend ....................***Timeout 388.56 sec Test #1361: shm_example_simple_lap_s_facto2_sched4_kwayprojections_rqrrtbegin .......***Timeout 388.54 sec Test #1372: shm_example_simple_lap_d_facto0_sched4_not_pqrcpend .....................***Timeout 388.51 sec Test #1374: 
shm_example_simple_lap_d_facto0_sched4_kway_pqrcpend ....................***Timeout 388.48 sec Test #1378: shm_example_simple_lap_d_facto0_sched4_not_rqrcpend .....................***Timeout 388.45 sec Test #1381: shm_example_simple_lap_d_facto0_sched4_kwayprojections_rqrcpbegin .......***Timeout 388.41 sec Test #1397: shm_example_simple_lap_d_facto1_sched4_not_svdbegin .....................***Timeout 388.36 sec Test #1427: shm_example_simple_lap_d_facto1_sched4_kway_pqrcpilu0 ...................***Timeout 388.29 sec Test #1437: shm_example_simple_lap_d_facto2_sched4_kway_pqrcpbegin ..................***Timeout 388.26 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Test #1453: shm_example_simple_lap_d_facto2_sched4_not_rqrrtbegin ...................***Timeout 388.24 sec Test #1455: shm_example_simple_lap_d_facto2_sched4_kway_rqrrtbegin ..................***Timeout 388.23 sec ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Test #1463: shm_example_simple_lap_c_facto0_sched4_kway_svdbegin ....................***Timeout 388.22 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of 
projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Test #1471: shm_example_simple_lap_c_facto0_sched4_kwayprojections_pqrcpbegin .......***Timeout 388.21 sec Test #1475: shm_example_simple_lap_c_facto0_sched4_kway_rqrcpbegin ..................***Timeout 388.19 sec Test #1476: shm_example_simple_lap_c_facto0_sched4_kway_rqrcpend ....................***Timeout 388.18 sec Test #1479: shm_example_simple_lap_c_facto0_sched4_not_tqrcpbegin ...................***Timeout 388.17 sec Test #1482: shm_example_simple_lap_c_facto0_sched4_kway_tqrcpend ....................***Timeout 388.16 sec Test #1490: shm_example_simple_lap_c_facto0_sched4_kwayprojections_rqrrtend .........***Timeout 388.15 sec Test #1491: shm_example_simple_lap_c_facto0_sched4_kway_pqrcpilu0 ...................***Timeout 388.13 sec Test #1495: shm_example_simple_lap_c_facto1_sched4_kway_svdbegin ....................***Timeout 388.12 sec Test #1497: shm_example_simple_lap_c_facto1_sched4_kwayprojections_svdbegin .........***Timeout 388.11 sec Test #1500: shm_example_simple_lap_c_facto1_sched4_not_pqrcpend .....................***Timeout 388.09 sec Test #1511: shm_example_simple_lap_c_facto1_sched4_not_tqrcpbegin ...................***Timeout 388.08 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank 
parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Test #1513: shm_example_simple_lap_c_facto1_sched4_kway_tqrcpbegin ..................***Timeout 388.06 sec Test #1514: shm_example_simple_lap_c_facto1_sched4_kway_tqrcpend ....................***Timeout 388.04 sec Test #1515: shm_example_simple_lap_c_facto1_sched4_kwayprojections_tqrcpbegin .......***Timeout 388.02 sec Test #1517: shm_example_simple_lap_c_facto1_sched4_not_rqrrtbegin ...................***Timeout 388.01 sec Test #1522: shm_example_simple_lap_c_facto1_sched4_kwayprojections_rqrrtend .........***Timeout 387.99 sec Test #1523: shm_example_simple_lap_c_facto1_sched4_kway_pqrcpilu0 ...................***Timeout 387.97 sec Test #1526: shm_example_simple_lap_c_facto2_sched4_not_svdend .......................***Timeout 387.94 sec Test #1528: shm_example_simple_lap_c_facto2_sched4_kway_svdend ......................***Timeout 387.89 sec Test #1859: c_mpi_rep_example_step-by-step_lap_z_facto3 .............................***Timeout 387.85 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 
320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Start 1859: c_mpi_rep_example_step-by-step_lap_z_facto3 Test #1860: c_mpi_rep_example_step-by-step_lap_z_facto4 .............................***Timeout 387.85 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 1860: c_mpi_rep_example_step-by-step_lap_z_facto4 Test #1861: c_mpi_rep_example_personal_lap_s_facto0 .................................***Timeout 387.84 sec Start 1861: c_mpi_rep_example_personal_lap_s_facto0 Test #1864: c_mpi_rep_example_personal_lap_d_facto0 .................................***Timeout 387.83 sec Start 1864: c_mpi_rep_example_personal_lap_d_facto0 Test #1461: shm_example_simple_lap_c_facto0_sched4_not_svdbegin .....................***Timeout 387.82 sec Test #1465: shm_example_simple_lap_c_facto0_sched4_kwayprojections_svdbegin .........***Timeout 387.81 sec Test #1525: shm_example_simple_lap_c_facto2_sched4_not_svdbegin .....................***Timeout 387.80 sec Test #1866: c_mpi_rep_example_personal_lap_d_facto2 .................................***Timeout 387.79 sec Start 1866: c_mpi_rep_example_personal_lap_d_facto2 Test #1868: c_mpi_rep_example_personal_lap_c_facto1 .................................***Timeout 387.78 sec ischedInit: The thread number has been automatically set to 256 Start 1868: c_mpi_rep_example_personal_lap_c_facto1 Test #1869: c_mpi_rep_example_personal_lap_c_facto2 .................................***Timeout 387.77 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: 
Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Start 1869: c_mpi_rep_example_personal_lap_c_facto2 Test #1872: c_mpi_rep_example_personal_lap_z_facto0 .................................***Timeout 387.75 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Start 1872: c_mpi_rep_example_personal_lap_z_facto0 Test #1874: c_mpi_rep_example_personal_lap_z_facto2 .................................***Timeout 387.74 sec Start 1874: c_mpi_rep_example_personal_lap_z_facto2 Test #1876: c_mpi_rep_example_personal_lap_z_facto4 .................................***Timeout 387.72 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started 
PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Start 1876: c_mpi_rep_example_personal_lap_z_facto4 Test #1878: c_mpi_rep_example_simple_scotch_mm ......................................***Timeout 387.71 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Start 1878: c_mpi_rep_example_simple_scotch_mm Test #1880: c_mpi_rep_example_simple_scotch_mm2 .....................................***Timeout 387.71 sec ischedInit: The thread number has been automatically set to 256 Start 1880: c_mpi_rep_example_simple_scotch_mm2 Test #1881: c_mpi_rep_example_simple_single_rsa .....................................***Timeout 387.71 sec RSA driver is no longer supported and is replaced by the HB driver RSA driver is no longer supported and is replaced by the HB driver RSA driver is no longer supported and is replaced by the HB driver RSA driver is no longer supported and is replaced by the HB driver Start 1881: c_mpi_rep_example_simple_single_rsa Test #1533: 
shm_example_simple_lap_c_facto2_sched4_kway_pqrcpbegin ..................***Timeout 387.71 sec Test #1885: c_mpi_rep_example_step-by-step_single_rsa ...............................***Timeout 387.70 sec RSA driver is no longer supported and is replaced by the HB driver RSA driver is no longer supported and is replaced by the HB driver RSA driver is no longer supported and is replaced by the HB driver RSA driver is no longer supported and is replaced by the HB driver ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 12111 nnz: 40537 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.789380e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 1607873 Fill-in of L 39.664331 Time to compute symbol matrix 3.078153e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.223937e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: 
Start 1885: c_mpi_rep_example_step-by-step_single_rsa Test #1887: c_mpi_rep_example_step-by-step_single_hb ................................***Timeout 387.70 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Start 1887: c_mpi_rep_example_step-by-step_single_hb Test #1892: c_mpi_rep_example_refinement_lap_s_refine_cg_sym ........................***Timeout 387.70 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Start 1892: c_mpi_rep_example_refinement_lap_s_refine_cg_sym Test #1894: c_mpi_rep_example_refinement_lap_s_refine_bicgstab_sym ..................***Timeout 387.69 sec ischedInit: The thread number has been 
automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.985568e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.695885e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.947489e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.782976e-01 s Time to initialize internal csc 2.562071e-03 s - iteration 1 : total iteration time 4.1 s error 0.07074 - iteration 2 : total iteration time 3.18 s error 0.011538 - iteration 3 : total iteration time 7.08 s error 0.0023186 - iteration 4 : total iteration time 1.19 s error 0.0004819 Start 1894: 
c_mpi_rep_example_refinement_lap_s_refine_bicgstab_sym Test #1896: c_mpi_rep_example_refinement_lap_d_refine_gmres_sym .....................***Timeout 387.68 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Start 1896: c_mpi_rep_example_refinement_lap_d_refine_gmres_sym Test #1904: c_mpi_rep_example_refinement_lap_z_refine_cg_her ........................***Timeout 387.65 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Start 1904: c_mpi_rep_example_refinement_lap_z_refine_cg_her Test #1905: c_mpi_rep_example_refinement_lap_z_refine_gmres_her 
.....................***Timeout 387.64 sec Start 1905: c_mpi_rep_example_refinement_lap_z_refine_gmres_her Test #1906: c_mpi_rep_example_refinement_lap_z_refine_bicgstab_her ..................***Timeout 387.63 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 1906: c_mpi_rep_example_refinement_lap_z_refine_bicgstab_her Test #1908: c_mpi_rep_example_refinement_lap_z_refine_gmres_sym .....................***Timeout 387.62 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The 
thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.731926e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.622839e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.937527e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.921992e+00 s Time to initialize internal csc 9.325948e-03 s - iteration 1 : total iteration time 0.499 s error 0.20013 - iteration 2 : total iteration time 0.812 s error 0.056488 - iteration 3 : total iteration time 1.22 s error 0.017842 - iteration 4 : total iteration time 3.33 s error 0.0060829 - iteration 5 : total iteration time 0.878 s error 0.0021257 - iteration 6 : total iteration time 1.13 s error 0.00075052 - iteration 7 : total iteration time 0.871 s error 0.00026229 - iteration 8 : total iteration time 7.25 s error 8.7579e-05 - iteration 9 : total iteration time 2.28 s error 2.9067e-05 - iteration 10 : total iteration time 1.45 s error 9.6345e-06 - iteration 11 : total iteration time 2.37 s error 2.9776e-06 - iteration 12 : total iteration time 2.05 s error 8.9895e-07 - iteration 13 : total iteration time 1.85 s error 2.6946e-07 - iteration 14 : total iteration time 4.46 s error 7.9558e-08 - iteration 15 : total iteration time 2.74 s error 2.3189e-08 - iteration 16 : total iteration time 13 s error 6.8098e-09 - iteration 17 : total iteration time 2.75 s 
error 1.9121e-09 Start 1908: c_mpi_rep_example_refinement_lap_z_refine_gmres_sym Test #1909: c_mpi_rep_example_refinement_lap_z_refine_bicgstab_sym ..................***Timeout 387.61 sec ischedInit: The thread number has been automatically set to 256 Start 1909: c_mpi_rep_example_refinement_lap_z_refine_bicgstab_sym Test #1910: c_mpi_rep_example_simple_mixed_refine_cg ................................***Timeout 387.60 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 1910: c_mpi_rep_example_simple_mixed_refine_cg Test #1913: c_mpi_rep_example_simple_mixed_lap_d_refine_cg_sym ......................***Timeout 387.57 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Start 1913: c_mpi_rep_example_simple_mixed_lap_d_refine_cg_sym Test #1914: c_mpi_rep_example_simple_mixed_lap_d_refine_gmres_sym ...................***Timeout 387.57 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: 
Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Start 1914: c_mpi_rep_example_simple_mixed_lap_d_refine_gmres_sym Test #1916: c_mpi_rep_example_simple_mixed_lap_z_refine_cg_her ......................***Timeout 387.55 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 1916: c_mpi_rep_example_simple_mixed_lap_z_refine_cg_her Test #1917: c_mpi_rep_example_simple_mixed_lap_z_refine_gmres_her ...................***Timeout 387.54 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number 
of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Start 1917: c_mpi_rep_example_simple_mixed_lap_z_refine_gmres_her Test #1918: c_mpi_rep_example_simple_mixed_lap_z_refine_bicgstab_her ................***Timeout 387.53 sec Start 1918: c_mpi_rep_example_simple_mixed_lap_z_refine_bicgstab_her Test #1923: mpi_rep_example_simple_lap_s_facto1_sched0_1d ...........................***Timeout 387.51 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 1923: mpi_rep_example_simple_lap_s_facto1_sched0_1d Test #1925: mpi_rep_example_simple_lap_d_facto0_sched0_1d ...........................***Timeout 387.49 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 1925: mpi_rep_example_simple_lap_d_facto0_sched0_1d Test #1926: mpi_rep_example_simple_lap_d_facto1_sched0_1d ...........................***Timeout 387.47 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 1926: mpi_rep_example_simple_lap_d_facto1_sched0_1d 
Test #1928: mpi_rep_example_simple_lap_c_facto0_sched0_1d ...........................***Timeout 387.43 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Start 1928: mpi_rep_example_simple_lap_c_facto0_sched0_1d Test #1929: mpi_rep_example_simple_lap_c_facto1_sched0_1d ...........................***Timeout 387.42 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 1929: mpi_rep_example_simple_lap_c_facto1_sched0_1d Test #1932: mpi_rep_example_simple_lap_c_facto4_sched0_1d ...........................***Timeout 387.41 sec ischedInit: The thread number has been automatically set to 256 Start 1932: mpi_rep_example_simple_lap_c_facto4_sched0_1d Test #1940: mpi_rep_example_simple_lap_s_facto2_sched1_1d ...........................***Timeout 387.34 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: 
Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 1940: mpi_rep_example_simple_lap_s_facto2_sched1_1d Test #1941: mpi_rep_example_simple_lap_d_facto0_sched1_1d ...........................***Timeout 387.33 sec Start 1941: mpi_rep_example_simple_lap_d_facto0_sched1_1d Test #1945: mpi_rep_example_simple_lap_c_facto1_sched1_1d ...........................***Timeout 387.29 sec Start 1945: mpi_rep_example_simple_lap_c_facto1_sched1_1d Test #1946: mpi_rep_example_simple_lap_c_facto2_sched1_1d ...........................***Timeout 387.27 sec Start 1946: mpi_rep_example_simple_lap_c_facto2_sched1_1d Test #1951: mpi_rep_example_simple_lap_z_facto2_sched1_1d ...........................***Timeout 387.10 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been 
automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 1951: mpi_rep_example_simple_lap_z_facto2_sched1_1d Test #1953: mpi_rep_example_simple_lap_z_facto4_sched1_1d ...........................***Timeout 387.08 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.208151e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.579996e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.555974e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.039979e+00 
s +-------------------------------------------------+ Analyze task: Total time for analyze 1.342622e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 5.375944e-02 s Time to initialize coeftab 6.709056e-02 s Time to factorize 6.362743e-01 s (33.49 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Memory usage of coeftab 274 Ko Time to solve 7.020455e-01 s Time for refinement 2.269239e+00 s || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.071127e-16 max(|| b_i - A x_i ||_1) 1.701909e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.294494e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.071127e-16 max(|| b_i - A x_i ||_1) 1.701909e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.294494e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.071127e-16 max(|| b_i - A x_i ||_1) 1.701909e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.294494e-03 (SUCCESS) || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.071127e-16 max(|| b_i - A x_i ||_1) 1.701909e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.294494e-03 (SUCCESS) Start 1953: mpi_rep_example_simple_lap_z_facto4_sched1_1d Test #1954: mpi_rep_example_simple_lap_s_facto0_sched4_1d ...........................***Timeout 387.07 sec ischedInit: The thread number has been automatically set to 256 Start 1954: mpi_rep_example_simple_lap_s_facto0_sched4_1d Test #1955: mpi_rep_example_simple_lap_s_facto1_sched4_1d ...........................***Timeout 387.07 sec ischedInit: The thread number has been automatically set to 256 Start 1955: 
mpi_rep_example_simple_lap_s_facto1_sched4_1d Test #1958: mpi_rep_example_simple_lap_d_facto1_sched4_1d ...........................***Timeout 387.05 sec Start 1958: mpi_rep_example_simple_lap_d_facto1_sched4_1d Test #1960: mpi_rep_example_simple_lap_c_facto0_sched4_1d ...........................***Timeout 387.04 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 1960: mpi_rep_example_simple_lap_c_facto0_sched4_1d Test #1549: shm_example_simple_lap_c_facto2_sched4_not_rqrrtbegin ...................***Timeout 387.03 sec Test #1962: mpi_rep_example_simple_lap_c_facto2_sched4_1d ...........................***Timeout 387.02 sec ischedInit: The thread number has been automatically set to 256 Start 1962: mpi_rep_example_simple_lap_c_facto2_sched4_1d Test #1963: mpi_rep_example_simple_lap_c_facto3_sched4_1d ...........................***Timeout 387.02 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Start 1963: mpi_rep_example_simple_lap_c_facto3_sched4_1d Test #1966: mpi_rep_example_simple_lap_z_facto1_sched4_1d ...........................***Timeout 387.00 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread 
number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Start 1966: mpi_rep_example_simple_lap_z_facto1_sched4_1d Test #1967: mpi_rep_example_simple_lap_z_facto2_sched4_1d ...........................***Timeout 386.99 sec Start 1967: mpi_rep_example_simple_lap_z_facto2_sched4_1d Test #1968: mpi_rep_example_simple_lap_z_facto3_sched4_1d ...........................***Timeout 386.98 sec ischedInit: The thread number has been automatically set to 256 Start 1968: mpi_rep_example_simple_lap_z_facto3_sched4_1d Test #1969: mpi_rep_example_simple_lap_z_facto4_sched4_1d ...........................***Timeout 386.98 sec Start 1969: mpi_rep_example_simple_lap_z_facto4_sched4_1d Test #1970: mpi_dst_example_simple_lap_s_facto0_sched0_1d ...........................***Timeout 386.97 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 1970: mpi_dst_example_simple_lap_s_facto0_sched0_1d Test #1972: mpi_dst_example_simple_lap_s_facto2_sched0_1d ...........................***Timeout 386.96 sec ischedInit: The thread number has been automatically set to 256 Start 1972: mpi_dst_example_simple_lap_s_facto2_sched0_1d Test #1973: mpi_dst_example_simple_lap_d_facto0_sched0_1d ...........................***Timeout 386.93 sec ischedInit: The thread number has been automatically set to 256 Start 1973: 
mpi_dst_example_simple_lap_d_facto0_sched0_1d Test #1977: mpi_dst_example_simple_lap_c_facto1_sched0_1d ...........................***Timeout 386.91 sec Start 1977: mpi_dst_example_simple_lap_c_facto1_sched0_1d Test #1978: mpi_dst_example_simple_lap_c_facto2_sched0_1d ...........................***Timeout 386.90 sec Start 1978: mpi_dst_example_simple_lap_c_facto2_sched0_1d Test #1984: mpi_dst_example_simple_lap_z_facto3_sched0_1d ...........................***Timeout 386.90 sec ischedInit: The thread number has been automatically set to 256 Start 1984: mpi_dst_example_simple_lap_z_facto3_sched0_1d Test #1985: mpi_dst_example_simple_lap_z_facto4_sched0_1d ...........................***Timeout 386.89 sec Start 1985: mpi_dst_example_simple_lap_z_facto4_sched0_1d Test #1987: mpi_dst_example_simple_lap_s_facto1_sched1_1d ...........................***Timeout 386.88 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 1987: 
mpi_dst_example_simple_lap_s_facto1_sched1_1d Test #1988: mpi_dst_example_simple_lap_s_facto2_sched1_1d ...........................***Timeout 386.87 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 1988: mpi_dst_example_simple_lap_s_facto2_sched1_1d Test #1989: mpi_dst_example_simple_lap_d_facto0_sched1_1d ...........................***Timeout 386.87 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 1989: mpi_dst_example_simple_lap_d_facto0_sched1_1d Test #1990: mpi_dst_example_simple_lap_d_facto1_sched1_1d ...........................***Timeout 386.86 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.443441e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct 
Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.837788e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.367213e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.246497e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.641847e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.940953e-01 s Time to initialize coeftab 1.299423e+00 s Time to factorize 7.731269e-01 s ( 6.77 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Memory usage of coeftab 137 Ko Start 1990: mpi_dst_example_simple_lap_d_facto1_sched1_1d Test #1994: mpi_dst_example_simple_lap_c_facto2_sched1_1d ...........................***Timeout 386.85 sec ischedInit: The thread number has been automatically set to 256 Start 1994: mpi_dst_example_simple_lap_c_facto2_sched1_1d Test #1580: shm_example_simple_lap_c_facto3_sched4_kwayprojections_tqrcpend .........***Timeout 386.84 sec Test #1995: mpi_dst_example_simple_lap_c_facto3_sched1_1d ...........................***Timeout 386.81 sec Start 1995: mpi_dst_example_simple_lap_c_facto3_sched1_1d Test #1996: mpi_dst_example_simple_lap_c_facto4_sched1_1d ...........................***Timeout 386.81 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 
6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Start 1996: mpi_dst_example_simple_lap_c_facto4_sched1_1d Test #1997: mpi_dst_example_simple_lap_z_facto0_sched1_1d ...........................***Timeout 386.80 sec ischedInit: The thread number has been automatically set to 256 Start 1997: mpi_dst_example_simple_lap_z_facto0_sched1_1d Test #1998: mpi_dst_example_simple_lap_z_facto1_sched1_1d ...........................***Timeout 386.79 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Start 1998: mpi_dst_example_simple_lap_z_facto1_sched1_1d Test #1999: mpi_dst_example_simple_lap_z_facto2_sched1_1d ...........................***Timeout 386.78 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : 
Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.137831e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.439644e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.000417e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 6.671302e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.238672e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.945844e-01 s Time to initialize coeftab 2.422659e-01 s Time to factorize 6.794666e-01 s (58.83 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Memory usage of coeftab 548 Ko Start 1999: mpi_dst_example_simple_lap_z_facto2_sched1_1d Test #1582: 
shm_example_simple_lap_c_facto3_sched4_not_rqrrtend .....................***Timeout 386.76 sec Test #2001: mpi_dst_example_simple_lap_z_facto4_sched1_1d ...........................***Timeout 386.75 sec Start 2001: mpi_dst_example_simple_lap_z_facto4_sched1_1d Test #2002: mpi_dst_example_simple_lap_s_facto0_sched4_1d ...........................***Timeout 386.74 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Start 2002: mpi_dst_example_simple_lap_s_facto0_sched4_1d Test #1585: shm_example_simple_lap_c_facto3_sched4_kwayprojections_rqrrtbegin .......***Timeout 386.72 sec Test #1594: shm_example_simple_lap_c_facto4_sched4_kwayprojections_svdend ...........***Timeout 386.54 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD 
Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.912616e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.630182e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.670789e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.658974e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.304859e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 9.324079e-03 s Time to initialize coeftab 4.462755e-01 s Time to factorize 3.882711e+00 s ( 5.49 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 3.52 Ko Outside 4.22 Ko Low-rank supernodes Diag in diag 248 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 382 Ko / 382 Ko 
------------------------------------------------ Total 638 Ko / 638 Ko Time to solve 1.155105e+00 s Time for refinement 1.520217e+01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.080485e-07 max(|| b_i - A x_i ||_1) 8.879446e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.240550e+00 (SUCCESS) Test #2005: mpi_dst_example_simple_lap_d_facto0_sched4_1d ...........................***Timeout 386.52 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.993984e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.687098e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.751085e-01 
s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 9.980020e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.159506e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.076909e-01 s Time to initialize coeftab 6.404060e-01 s Time to factorize 7.019950e+00 s (738.44 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Memory usage of coeftab 137 Ko Time to solve 4.306216e+00 s Time for refinement 3.646987e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.133329e-16 max(|| b_i - A x_i ||_1) 1.795219e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.255846e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.133329e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.133329e-16 max(|| b_i - A x_i ||_1) 1.795219e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.255846e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 1.795219e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.255846e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.133329e-16 max(|| b_i - A x_i ||_1) 1.795219e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.255846e-03 (SUCCESS) Start 2005: mpi_dst_example_simple_lap_d_facto0_sched4_1d Test #2007: mpi_dst_example_simple_lap_d_facto2_sched4_1d ...........................***Timeout 386.34 sec ischedInit: The thread 
number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.377716e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.209040e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.138472e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 3.103428e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.773372e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize 
internal csc 7.965827e-01 s Time to initialize coeftab 8.234320e-01 s Time to factorize 6.492097e+00 s ( 1.54 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Memory usage of coeftab 274 Ko Time to solve 6.419185e+00 s Time for refinement 1.831994e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.011301e-16 max(|| b_i - A x_i ||_1) 1.657828e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.083203e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.011301e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.011301e-16 max(|| b_i - A x_i ||_1) 1.657828e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.083203e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.011301e-16 max(|| b_i - A x_i ||_1) 1.657828e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.083203e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 1.657828e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.083203e-03 (SUCCESS) Start 2007: mpi_dst_example_simple_lap_d_facto2_sched4_1d Test #2008: mpi_dst_example_simple_lap_c_facto0_sched4_1d ...........................***Timeout 386.33 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI 
processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.929189e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.387555e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.585919e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.576208e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.132697e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 7.341315e-02 s Time to initialize coeftab 6.340056e-01 s Time to factorize 7.460029e+00 s ( 2.72 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Memory usage of coeftab 137 Ko Time to solve 3.821563e+00 s Time for refinement 4.009526e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 
6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.946372e-07 max(|| b_i - A x_i ||_1) 8.758135e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.209982e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.946372e-07 max(|| b_i - A x_i ||_1) 8.758135e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.209982e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.946372e-07 max(|| b_i - A x_i ||_1) 8.758135e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.209982e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.946372e-07 max(|| b_i - A x_i ||_1) 8.758135e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.209982e+00 (SUCCESS) Start 2008: mpi_dst_example_simple_lap_c_facto0_sched4_1d Test #2009: mpi_dst_example_simple_lap_c_facto1_sched4_1d ...........................***Timeout 386.32 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering 
subtask : Ordering method is: Scotch Time to compute ordering 3.891076e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.693645e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.947730e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.641363e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.430120e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.630273e-01 s Time to initialize coeftab 5.071711e-01 s Time to factorize 5.037296e+00 s ( 4.23 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Memory usage of coeftab 137 Ko Time to solve 5.250667e+00 s Time for refinement 1.658315e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.814018e-07 max(|| b_i - A x_i ||_1) 7.948400e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.005658e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.814018e-07 max(|| b_i - A x_i ||_1) 7.948400e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.005658e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.814018e-07 max(|| b_i - A x_i 
||_1) 7.948400e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.005658e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.814018e-07 max(|| b_i - A x_i ||_1) 7.948400e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.005658e+00 (SUCCESS) Start 2009: mpi_dst_example_simple_lap_c_facto1_sched4_1d Test #2010: mpi_dst_example_simple_lap_c_facto2_sched4_1d ...........................***Timeout 386.31 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.427010e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.421676e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.091525e-01 s +-------------------------------------------------+ 
Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.048752e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.032689e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 5.726398e-01 s Time to initialize coeftab 1.380926e+00 s Time to factorize 3.734905e+00 s (10.70 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Memory usage of coeftab 274 Ko Time to solve 9.298936e-01 s Time for refinement 4.842602e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.764615e-07 max(|| b_i - A x_i ||_1) 7.635721e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.926758e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.764615e-07 max(|| b_i - A x_i ||_1) 7.635721e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.926758e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.764615e-07 max(|| b_i - A x_i ||_1) 7.635721e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.926758e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.764615e-07 max(|| b_i - A x_i ||_1) 7.635721e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.926758e+00 (SUCCESS) Start 2010: mpi_dst_example_simple_lap_c_facto2_sched4_1d Test #2014: mpi_dst_example_simple_lap_z_facto1_sched4_1d ...........................***Timeout 386.29 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The 
thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.417401e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.611367e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.353843e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.877896e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.561362e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.406080e-01 s Time to initialize 
coeftab 5.725911e-01 s Time to factorize 5.107693e+00 s ( 4.17 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Memory usage of coeftab 274 Ko Time to solve 6.355199e+00 s Time for refinement 3.080534e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.096802e-16 max(|| b_i - A x_i ||_1) 1.755683e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.430183e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.096802e-16 max(|| b_i - A x_i ||_1) 1.755683e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.430183e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.096802e-16 max(|| b_i - A x_i ||_1) 1.755683e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.430183e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.096802e-16 max(|| b_i - A x_i ||_1) 1.755683e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.430183e-03 (SUCCESS) Start 2014: mpi_dst_example_simple_lap_z_facto1_sched4_1d Test #2019: mpi_dst_example_simple_lap_s_facto0_sched0_not_svdend ...................***Timeout 386.28 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL 
GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 3: 200 660 2: 200 760 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.620944e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.331773e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.145518e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.089483e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.774351e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.818310e-01 s Time to initialize coeftab 3.358326e+00 s Time to factorize 4.184985e+00 s ( 1.21 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ 
Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 4.930111e-01 s Time for refinement 6.020035e-01 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.913705e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.913705e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.913705e-07 max(|| b_i - A x_i ||_1) 8.598442e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.080472e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.913705e-07 max(|| b_i - A x_i ||_1) 8.598442e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.080472e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.598442e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.080472e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.598442e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.080472e+00 (SUCCESS) Start 2019: mpi_dst_example_simple_lap_s_facto0_sched0_not_svdend Test #2020: mpi_dst_example_simple_lap_s_facto0_sched0_kway_svdbegin ................***Timeout 386.27 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) 
Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.269218e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.734226e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.543597e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 4.562552e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.802740e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 8.506151e-01 s Time to initialize coeftab 6.030071e-01 s Time to factorize 1.264980e+01 s (409.79 KFlop/s) Number of operations 25.89 MFlops 
Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 5.257008e-03 s Time for refinement 5.555609e-03 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.048760e-07 max(|| b_i - A x_i ||_1) 9.459670e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.188693e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.048760e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.048760e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.048760e-07 max(|| b_i - A x_i ||_1) 9.459670e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.188693e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 9.459670e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.188693e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 9.459670e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.188693e+00 (SUCCESS) Start 2020: mpi_dst_example_simple_lap_s_facto0_sched0_kway_svdbegin Test #1636: shm_example_simple_lap_z_facto0_sched4_kway_rqrcpend ....................***Timeout 386.26 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of 
GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.233061e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.202091e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.221082e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 1.254204e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.391930e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.858877e-01 s Time to initialize coeftab 6.532611e-02 s Time to factorize 4.986843e+00 s ( 4.07 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes 
Diag in diag 497 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1.25 Mo / 1.25 Mo Time to solve 3.307670e+00 s Time for refinement 1.725257e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.930100e-16 max(|| b_i - A x_i ||_1) 1.996834e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.038690e-03 (SUCCESS) Test #2022: mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_svdbegin .....***Timeout 386.25 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask 
: Ordering method is: Scotch Time to compute ordering 2.860382e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.245085e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.709655e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.681866e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.160284e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.402224e+00 s Time to initialize coeftab 5.203403e-01 s Time to factorize 6.483922e+00 s (799.48 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 2.346028e-01 s Time for refinement 2.772339e-01 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.046939e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.046939e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 
2.046939e-07 max(|| b_i - A x_i ||_1) 9.445343e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.186893e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.046939e-07 max(|| b_i - A x_i ||_1) 9.445343e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.186893e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 9.445343e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.186893e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 9.445343e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.186893e+00 (SUCCESS) Start 2022: mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_svdbegin Test #2025: mpi_dst_example_simple_lap_s_facto0_sched0_not_pqrcpend .................***Timeout 386.22 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 
300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.366210e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.920731e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.698413e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.763901e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.569868e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.640547e-01 s Time to initialize coeftab 1.859460e-01 s Time to factorize 2.206211e+00 s ( 2.29 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 2.104173e-01 s Time for refinement 2.268404e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 
3.098116e-07 max(|| b_i - A x_i ||_1) 9.590242e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.205101e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.098116e-07 max(|| b_i - A x_i ||_1) 9.590242e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.205101e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.098116e-07 max(|| b_i - A x_i ||_1) 9.590242e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.205101e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.098116e-07 max(|| b_i - A x_i ||_1) 9.590242e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.205101e+00 (SUCCESS) Start 2025: mpi_dst_example_simple_lap_s_facto0_sched0_not_pqrcpend Test #2026: mpi_dst_example_simple_lap_s_facto0_sched0_kway_pqrcpbegin ..............***Timeout 386.22 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.170078e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.053319e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.818606e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.080054e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.330544e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 9.752095e-03 s Time to initialize coeftab 4.334556e-01 s Time to factorize 5.032467e+00 s ( 1.01 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 5.358621e-03 s - iteration 1 : total iteration time 0.00243 s error 5.5354e-11 Time for refinement 8.616991e-03 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i 
||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.022804e-08 max(|| b_i - A x_i ||_1) 2.923980e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.674245e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.022804e-08 max(|| b_i - A x_i ||_1) 2.923980e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.674245e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.022804e-08 max(|| b_i - A x_i ||_1) 2.923980e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.674245e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.022804e-08 max(|| b_i - A x_i ||_1) 2.923980e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.674245e-01 (SUCCESS) Start 2026: mpi_dst_example_simple_lap_s_facto0_sched0_kway_pqrcpbegin Test #2027: mpi_dst_example_simple_lap_s_facto0_sched0_kway_pqrcpend ................***Timeout 386.21 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 
Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.264168e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.266748e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.312020e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.072420e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.391470e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.529449e-01 s Time to initialize coeftab 6.059707e-01 s Time to factorize 1.872804e+00 s ( 2.70 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.683571e-01 s Time for refinement 1.954762e-01 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i ||_oo) 2.093077e-02 
max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.096762e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.096762e-07 max(|| b_i - A x_i ||_1) 9.580319e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.203854e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 9.580319e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.203854e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.096762e-07 max(|| b_i - A x_i ||_1) 9.580319e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.203854e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.096762e-07 max(|| b_i - A x_i ||_1) 9.580319e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.203854e+00 (SUCCESS) Start 2027: mpi_dst_example_simple_lap_s_facto0_sched0_kway_pqrcpend Test #1646: shm_example_simple_lap_z_facto0_sched4_not_rqrrtend .....................***Timeout 386.20 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections 
distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.189290e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.933649e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.525967e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 4.303185e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.274094e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.385747e-01 s Time to initialize coeftab 4.274564e-01 s Time to factorize 1.396347e+01 s ( 1.45 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 497 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1.25 Mo / 1.25 Mo Time to solve 9.357387e+00 s Time for refinement 7.197380e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.024311e-16 max(|| b_i - A x_i ||_1) 2.043102e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.155438e-03 (SUCCESS) Test #2028: 
mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_pqrcpbegin ...***Timeout 386.18 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.388365e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.407387e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.514691e-01 s +-------------------------------------------------+ 
Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.545504e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.590857e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.680895e-01 s Time to initialize coeftab 3.658684e-01 s Time to factorize 4.128506e+00 s ( 1.23 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 2.246081e-01 s - iteration 1 : total iteration time 0.185 s error 5.5354e-11 Time for refinement 5.264927e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.022804e-08 max(|| b_i - A x_i ||_1) 2.923980e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.674245e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.022804e-08 max(|| b_i - A x_i ||_1) 2.923980e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.674245e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.022804e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.022804e-08 max(|| b_i - A x_i ||_1) 2.923980e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.674245e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 
2.923980e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.674245e-01 (SUCCESS) Start 2028: mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_pqrcpbegin Test #2030: mpi_dst_example_simple_lap_s_facto0_sched0_not_rqrcpbegin ...............***Timeout 386.15 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.562765e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.323280e-03 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.394962e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.165840e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.817025e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.033056e+00 s Time to initialize coeftab 1.073885e+00 s Time to factorize 2.182979e+01 s (237.46 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.2 Ko / 44.3 Ko ------------------------------------------------ Total 68.4 Ko / 68.5 Ko Time to solve 2.012765e-01 s - iteration 1 : total iteration time 0.323 s error 5.5277e-11 Time for refinement 7.162701e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.998336e-08 max(|| b_i - A x_i ||_1) 2.896496e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.639709e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.998336e-08 max(|| b_i - A x_i ||_1) 2.896496e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.639709e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.998336e-08 
max(|| b_i - A x_i ||_1) 2.896496e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.639709e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.998336e-08 max(|| b_i - A x_i ||_1) 2.896496e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.639709e-01 (SUCCESS) Start 2030: mpi_dst_example_simple_lap_s_facto0_sched0_not_rqrcpbegin Test #2032: mpi_dst_example_simple_lap_s_facto0_sched0_kway_rqrcpbegin ..............***Timeout 386.14 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.778008e+01 s +-------------------------------------------------+ Symbolic factorization 
subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.371466e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.801806e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 7.627735e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.571999e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.773985e-01 s Time to initialize coeftab 7.428989e-01 s Time to factorize 8.602345e+00 s (602.60 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.2 Ko / 44.3 Ko ------------------------------------------------ Total 68.4 Ko / 68.5 Ko Time to solve 1.796398e-01 s - iteration 1 : total iteration time 1.05 s error 5.5277e-11 Time for refinement 1.955188e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.998336e-08 max(|| b_i - A x_i ||_1) 2.896496e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.639709e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.998336e-08 max(|| b_i - A 
x_i ||_2 / || b_i ||_2) 5.998336e-08 max(|| b_i - A x_i ||_1) 2.896496e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.639709e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.998336e-08 max(|| b_i - A x_i ||_1) 2.896496e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.639709e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.896496e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.639709e-01 (SUCCESS) Start 2032: mpi_dst_example_simple_lap_s_facto0_sched0_kway_rqrcpbegin Test #2033: mpi_dst_example_simple_lap_s_facto0_sched0_kway_rqrcpend ................***Timeout 386.13 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering 
subtask : Ordering method is: Scotch Time to compute ordering 1.941532e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.036310e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.816250e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 5.350473e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.635644e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.076666e+00 s Time to initialize coeftab 1.969444e-01 s Time to factorize 3.845497e+00 s ( 1.32 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 3.145884e-01 s Time for refinement 3.659686e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.106760e-07 max(|| b_i - A x_i ||_1) 9.603521e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 
1.206769e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.106760e-07 max(|| b_i - A x_i ||_1) 9.603521e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.206769e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.106760e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.106760e-07 max(|| b_i - A x_i ||_1) 9.603521e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.206769e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 9.603521e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.206769e+00 (SUCCESS) Start 2033: mpi_dst_example_simple_lap_s_facto0_sched0_kway_rqrcpend Test #2034: mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_rqrcpbegin ...***Timeout 386.06 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 
300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.361396e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.004777e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.504322e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 4.937257e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.892768e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 7.969889e-01 s Time to initialize coeftab 2.460937e+00 s Time to factorize 9.209296e+00 s (562.89 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.2 Ko / 44.3 Ko ------------------------------------------------ Total 68.4 Ko / 68.5 Ko Time to solve 3.550251e-02 s - iteration 1 : total iteration time 0.0904 s error 5.5277e-11 Time for refinement 2.300319e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| 
x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.998336e-08 max(|| b_i - A x_i ||_1) 2.896496e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.639709e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.998336e-08 max(|| b_i - A x_i ||_1) 2.896496e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.639709e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.998336e-08 max(|| b_i - A x_i ||_1) 2.896496e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.639709e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.998336e-08 max(|| b_i - A x_i ||_1) 2.896496e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.639709e-01 (SUCCESS) Start 2034: mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_rqrcpbegin Test #2035: mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_rqrcpend .....***Timeout 386.05 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has 
been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.111799e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.157419e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.081345e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 5.057147e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.715654e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 5.042192e-01 s Time to initialize coeftab 2.488999e+00 s Time to factorize 3.075011e+00 s ( 1.65 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 3.485600e-01 s Time for refinement 1.441740e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| 
b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.106819e-07 max(|| b_i - A x_i ||_1) 9.609415e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.207510e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.106819e-07 max(|| b_i - A x_i ||_1) 9.609415e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.207510e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.106819e-07 max(|| b_i - A x_i ||_1) 9.609415e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.207510e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.106819e-07 max(|| b_i - A x_i ||_1) 9.609415e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.207510e+00 (SUCCESS) Start 2035: mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_rqrcpend Test #2036: mpi_dst_example_simple_lap_s_facto0_sched0_not_tqrcpbegin ...............***Timeout 386.04 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 
2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.992901e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.746117e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.551290e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.210522e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.322770e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.397366e-01 s Time to initialize coeftab 6.135978e-01 s Time to factorize 6.640685e+00 s (780.61 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.2 Ko / 44.3 Ko ------------------------------------------------ Total 68.4 Ko / 68.5 Ko Time to solve 1.919337e-01 s - iteration 1 : total iteration time 0.12 s error 5.556e-11 Time for refinement 3.785490e-01 s || A ||_1 5.112499e-02 max(|| 
b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.053065e-08 max(|| b_i - A x_i ||_1) 2.921346e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.670936e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.053065e-08 max(|| b_i - A x_i ||_1) 2.921346e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.670936e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.053065e-08 max(|| b_i - A x_i ||_1) 2.921346e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.670936e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.053065e-08 max(|| b_i - A x_i ||_1) 2.921346e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.670936e-01 (SUCCESS) Start 2036: mpi_dst_example_simple_lap_s_facto0_sched0_not_tqrcpbegin Test #1657: shm_example_simple_lap_z_facto1_sched4_kwayprojections_svdbegin .........***Timeout 386.03 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute 
Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.418343e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.311383e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.349596e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.215154e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.124958e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.440285e-02 s Time to initialize coeftab 7.639394e-01 s Time to factorize 3.511133e+00 s ( 6.07 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 497 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1.25 Mo / 1.25 Mo Time to solve 1.797690e+00 s - iteration 1 : total iteration time 5.51 s error 1.6709e-14 Time for refinement 2.573973e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A 
x_i ||_2 / || b_i ||_2) 1.670956e-14 max(|| b_i - A x_i ||_1) 2.705802e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.827657e-02 (SUCCESS) Test #1658: shm_example_simple_lap_z_facto1_sched4_kwayprojections_svdend ...........***Timeout 386.02 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.693132e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.547226e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.174317e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number 
of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.261007e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.088387e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.828512e-02 s Time to initialize coeftab 3.875209e-01 s Time to factorize 3.977884e+00 s ( 5.36 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 497 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1.25 Mo / 1.25 Mo Time to solve 5.389594e+00 s Time for refinement 1.576587e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.814739e-16 max(|| b_i - A x_i ||_1) 1.877047e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.736427e-03 (SUCCESS) Test #1659: shm_example_simple_lap_z_facto1_sched4_not_pqrcpbegin ...................***Timeout 386.01 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP 
Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.248540e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.442660e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.718377e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.746189e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.081685e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 9.791444e-02 s Time to initialize coeftab 8.466002e-01 s Time to factorize 8.576621e+00 s ( 2.48 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 497 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1.25 Mo / 1.25 Mo Time to solve 4.876175e+00 s - iteration 1 : total iteration time 6.16 s error 1.07e-14 Time for refinement 
1.475543e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.069928e-14 max(|| b_i - A x_i ||_1) 1.808335e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.563043e-02 (SUCCESS) Test #1660: shm_example_simple_lap_z_facto1_sched4_not_pqrcpend .....................***Timeout 386.00 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.721216e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.751412e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.253663e-01 s 
+-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.528000e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.900103e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 5.705897e-01 s Time to initialize coeftab 1.834179e-01 s Time to factorize 9.143271e+00 s ( 2.33 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 497 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1.25 Mo / 1.25 Mo Time to solve 4.127492e+00 s Time for refinement 1.555602e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.775138e-16 max(|| b_i - A x_i ||_1) 1.853752e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.677644e-03 (SUCCESS) Test #1661: shm_example_simple_lap_z_facto1_sched4_kway_pqrcpbegin ..................***Timeout 385.98 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD 
Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.736121e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.420967e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.190825e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.845176e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.765879e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.776809e-02 s Time to initialize coeftab 2.898507e-01 s Time to factorize 5.020350e+00 s ( 4.24 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 497 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ 
Total 1.25 Mo / 1.25 Mo Time to solve 3.823549e+00 s - iteration 1 : total iteration time 23.9 s error 1.3461e-14 Time for refinement 2.897041e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.345995e-14 max(|| b_i - A x_i ||_1) 2.360793e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.957082e-02 (SUCCESS) Test #2038: mpi_dst_example_simple_lap_s_facto0_sched0_kway_tqrcpbegin ..............***Timeout 385.83 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.837027e+01 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.387160e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.886427e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 4.807512e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.356426e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.292250e-01 s Time to initialize coeftab 6.684735e+00 s Time to factorize 9.193989e+00 s (563.82 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.2 Ko / 44.3 Ko ------------------------------------------------ Total 68.4 Ko / 68.5 Ko Time to solve 9.109494e-02 s - iteration 1 : total iteration time 0.134 s error 5.556e-11 Time for refinement 3.890684e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.053065e-08 max(|| b_i - A x_i ||_1) 2.921346e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.670936e-01 
(SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.053065e-08 max(|| b_i - A x_i ||_1) 2.921346e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.670936e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.053065e-08 max(|| b_i - A x_i ||_1) 2.921346e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.670936e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.053065e-08 max(|| b_i - A x_i ||_1) 2.921346e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.670936e-01 (SUCCESS) Start 2038: mpi_dst_example_simple_lap_s_facto0_sched0_kway_tqrcpbegin Test #2039: mpi_dst_example_simple_lap_s_facto0_sched0_kway_tqrcpend ................***Timeout 385.82 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 
760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.232693e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.256970e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.311414e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.093246e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.436264e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.631097e-01 s Time to initialize coeftab 3.261102e-01 s Time to factorize 1.073819e+01 s (482.74 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 3.466798e-01 s Time for refinement 4.151360e-01 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.110048e-07 max(|| b_i - A 
x_i ||_2 / || b_i ||_2) 3.110048e-07 max(|| b_i - A x_i ||_1) 9.592181e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.205344e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 9.592181e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.205344e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.110048e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.110048e-07 max(|| b_i - A x_i ||_1) 9.592181e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.205344e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 9.592181e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.205344e+00 (SUCCESS) Start 2039: mpi_dst_example_simple_lap_s_facto0_sched0_kway_tqrcpend Test #2040: mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_tqrcpbegin ...***Timeout 385.81 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix 
type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.031954e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.296311e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.188750e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.500168e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.269656e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.536040e-01 s Time to initialize coeftab 7.407770e-01 s Time to factorize 1.404007e+01 s (369.21 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.2 Ko / 44.3 Ko ------------------------------------------------ Total 68.4 Ko / 68.5 Ko Time to solve 1.907849e-01 s - iteration 1 : total iteration time 0.166 s error 5.5597e-11 Time for refinement 5.549939e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i 
||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.037595e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.037595e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.037595e-08 max(|| b_i - A x_i ||_1) 2.915165e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.663169e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.915165e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.663169e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.037595e-08 max(|| b_i - A x_i ||_1) 2.915165e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.663169e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.915165e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.663169e-01 (SUCCESS) Start 2040: mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_tqrcpbegin Test #2041: mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_tqrcpend .....***Timeout 385.81 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections 
Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.909076e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.262265e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.107421e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 9.137332e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.112157e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.605351e-01 s Time to initialize coeftab 1.373267e+00 s Time to factorize 1.416935e+00 s ( 3.57 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.372197e-02 s Time for refinement 1.780738e-02 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| 
b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.110107e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.110107e-07 max(|| b_i - A x_i ||_1) 9.598076e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.206085e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.110107e-07 max(|| b_i - A x_i ||_1) 9.598076e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.206085e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.110107e-07 max(|| b_i - A x_i ||_1) 9.598076e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.206085e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 9.598076e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.206085e+00 (SUCCESS) Start 2041: mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_tqrcpend Test #2043: mpi_dst_example_simple_lap_s_facto0_sched0_not_rqrrtend .................***Timeout 385.65 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 
Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.323180e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.568062e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.352470e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 9.012729e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.438158e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.682898e-01 s Time to initialize coeftab 6.389626e-01 s Time to factorize 2.390600e+00 s ( 2.12 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.686442e-01 s Time for refinement 2.451978e-01 s || A ||_1 
5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.041317e-07 max(|| b_i - A x_i ||_1) 9.560967e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.201422e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.041317e-07 max(|| b_i - A x_i ||_1) 9.560967e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.201422e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.041317e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.041317e-07 max(|| b_i - A x_i ||_1) 9.560967e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.201422e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 9.560967e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.201422e+00 (SUCCESS) Start 2043: mpi_dst_example_simple_lap_s_facto0_sched0_not_rqrrtend Test #2046: mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_rqrrtbegin ...***Timeout 385.36 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 
Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.400346e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.968903e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.573698e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.607982e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.724527e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.648762e-01 s Time to initialize coeftab 6.492465e-01 s Time to factorize 3.237361e+00 s ( 1.56 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko 
------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 8.503678e-02 s - iteration 1 : total iteration time 0.12 s error 5.5928e-11 Time for refinement 3.293012e-01 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.770989e-08 max(|| b_i - A x_i ||_1) 2.855581e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.588296e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.770989e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.770989e-08 max(|| b_i - A x_i ||_1) 2.855581e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.588296e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.855581e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.588296e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.770989e-08 max(|| b_i - A x_i ||_1) 2.855581e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.588296e-01 (SUCCESS) Start 2046: mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_rqrrtbegin Test #2047: mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_rqrrtend .....***Timeout 385.36 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 
Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.042693e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.057570e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.085478e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 5.348879e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.617534e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 7.877157e-01 s Time to initialize coeftab 2.934103e+00 s Time to factorize 4.486505e+00 s ( 1.13 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: 
------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 2.290257e-01 s Time for refinement 2.641906e-01 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.041468e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.041468e-07 max(|| b_i - A x_i ||_1) 9.564177e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.201825e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.041468e-07 max(|| b_i - A x_i ||_1) 9.564177e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.201825e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 9.564177e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.201825e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.041468e-07 max(|| b_i - A x_i ||_1) 9.564177e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.201825e+00 (SUCCESS) Start 2047: mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_rqrrtend Test #2048: mpi_dst_example_simple_lap_s_facto0_sched0_kway_pqrcpilu0 ...............***Timeout 385.35 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of 
threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.115365e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.748244e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.581578e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.369422e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.419334e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 9.440475e-01 s Time to initialize coeftab 5.116664e-01 s Time to factorize 
9.219335e+00 s (562.27 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 8.633659e-02 s - iteration 1 : total iteration time 0.141 s error 3.1015e-11 Time for refinement 4.243107e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.061699e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.061699e-08 max(|| b_i - A x_i ||_1) 2.932194e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.684566e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.061699e-08 max(|| b_i - A x_i ||_1) 2.932194e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.684566e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.061699e-08 max(|| b_i - A x_i ||_1) 2.932194e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.684566e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.932194e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.684566e-01 (SUCCESS) Start 2048: mpi_dst_example_simple_lap_s_facto0_sched0_kway_pqrcpilu0 Test #2049: mpi_dst_example_simple_lap_s_facto0_sched0_kway_pqrcpilu1 ...............***Timeout 385.34 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started 
thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.610658e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.510257e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.609052e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 7.499885e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.721439e+01 s 
+-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.129909e-02 s Time to initialize coeftab 5.119938e-01 s Time to factorize 3.251128e-01 s (15.57 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 4.891304e-03 s - iteration 1 : total iteration time 0.00238 s error 3.0929e-11 Time for refinement 8.291287e-03 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.061699e-08 max(|| b_i - A x_i ||_1) 2.932194e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.684566e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.061699e-08 max(|| b_i - A x_i ||_1) 2.932194e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.684566e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.061699e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.061699e-08 max(|| b_i - A x_i ||_1) 2.932194e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.684566e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.932194e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.684566e-01 (SUCCESS) Start 2049: mpi_dst_example_simple_lap_s_facto0_sched0_kway_pqrcpilu1 Test #2050: mpi_dst_example_simple_lap_s_facto1_sched0_not_svdbegin .................***Timeout 385.34 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The 
thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.061186e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.892897e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.170572e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 
6.932254e-04 s Time for mapping/scheduling 1.610947e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.344857e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 6.098589e-01 s Time to initialize coeftab 5.137310e-01 s Time to factorize 1.086646e+01 s (493.18 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.586422e-01 s Time for refinement 2.843284e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.907110e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.907110e-07 max(|| b_i - A x_i ||_1) 8.452269e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.062104e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.452269e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.062104e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.907110e-07 max(|| b_i - A x_i ||_1) 8.452269e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.062104e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.907110e-07 max(|| b_i - A x_i ||_1) 8.452269e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.062104e+00 (SUCCESS) Start 2050: mpi_dst_example_simple_lap_s_facto1_sched0_not_svdbegin Test #1665: shm_example_simple_lap_z_facto1_sched4_not_rqrcpbegin 
...................***Timeout 385.33 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.645588e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.874310e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.893164e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.326745e+00 s +-------------------------------------------------+ Analyze task: Total 
time for analyze 3.065977e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.161236e-01 s Time to initialize coeftab 4.124283e-01 s Time to factorize 3.270565e+00 s ( 6.52 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 497 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1.25 Mo / 1.25 Mo Time to solve 4.946477e+00 s - iteration 1 : total iteration time 5.33 s error 1.2896e-14 Time for refinement 9.529867e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.289250e-14 max(|| b_i - A x_i ||_1) 2.055890e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.187709e-02 (SUCCESS) Test #2051: mpi_dst_example_simple_lap_s_facto1_sched0_not_svdend ...................***Timeout 385.32 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal 
height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.574783e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.234574e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.106263e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.069493e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.764832e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.649996e-01 s Time to initialize coeftab 2.900511e-01 s Time to factorize 2.995411e+00 s ( 1.75 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 7.862135e-02 s Time for 
refinement 1.397402e-01 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.685047e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.685047e-07 max(|| b_i - A x_i ||_1) 7.416855e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.319949e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 7.416855e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.319949e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.685047e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.685047e-07 max(|| b_i - A x_i ||_1) 7.416855e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.319949e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 7.416855e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.319949e-01 (SUCCESS) Start 2051: mpi_dst_example_simple_lap_s_facto1_sched0_not_svdend Test #2052: mpi_dst_example_simple_lap_s_facto1_sched0_kway_svdbegin ................***Timeout 385.31 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress 
minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.176277e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.020477e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.027938e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.826428e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.637806e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.756345e-02 s Time to initialize coeftab 7.612422e-01 s Time to factorize 1.054007e+00 s ( 4.97 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko 
------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.746585e-02 s Time for refinement 1.824819e-02 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.907110e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.907110e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.907110e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.907110e-07 max(|| b_i - A x_i ||_1) 8.452269e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.062104e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.452269e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.062104e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.452269e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.062104e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.452269e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.062104e+00 (SUCCESS) Start 2052: mpi_dst_example_simple_lap_s_facto1_sched0_kway_svdbegin Test #2053: mpi_dst_example_simple_lap_s_facto1_sched0_kway_svdend ..................***Timeout 385.31 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 
Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.392120e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.356491e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.443685e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.766908e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.578568e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.291293e-01 s Time to initialize coeftab 2.692676e-01 s Time to factorize 5.009737e+00 s ( 1.04 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 
24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.495206e-01 s Time for refinement 1.430604e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.684969e-07 max(|| b_i - A x_i ||_1) 7.423341e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.328098e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.684969e-07 max(|| b_i - A x_i ||_1) 7.423341e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.328098e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.684969e-07 max(|| b_i - A x_i ||_1) 7.423341e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.328098e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.684969e-07 max(|| b_i - A x_i ||_1) 7.423341e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.328098e-01 (SUCCESS) Start 2053: mpi_dst_example_simple_lap_s_facto1_sched0_kway_svdend Test #2054: mpi_dst_example_simple_lap_s_facto1_sched0_kwayprojections_svdbegin .....***Timeout 385.30 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size 
(min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.320059e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.239766e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.382663e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 9.175668e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.477638e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.061933e-01 s Time to initialize coeftab 8.561289e-01 s Time to factorize 3.946621e+00 s ( 1.33 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: 
------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 9.975770e-02 s Time for refinement 1.202214e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.907110e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.907110e-07 max(|| b_i - A x_i ||_1) 8.452269e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.062104e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.907110e-07 max(|| b_i - A x_i ||_1) 8.452269e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.062104e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.452269e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.062104e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.907110e-07 max(|| b_i - A x_i ||_1) 8.452269e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.062104e+00 (SUCCESS) Start 2054: mpi_dst_example_simple_lap_s_facto1_sched0_kwayprojections_svdbegin Test #2056: mpi_dst_example_simple_lap_s_facto1_sched0_not_pqrcpbegin ...............***Timeout 385.17 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: 
Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 3: 200 660 2: 200 760 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.290074e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.563756e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.675513e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.364169e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.460759e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.907729e-01 s Time to initialize coeftab 5.741582e-01 s Time to factorize 
1.072918e+01 s (499.49 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 8.346832e-02 s - iteration 1 : total iteration time 0.113 s error 5.5113e-11 Time for refinement 2.416700e-01 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.941202e-08 max(|| b_i - A x_i ||_1) 2.928208e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.679558e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.941202e-08 max(|| b_i - A x_i ||_1) 2.928208e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.679558e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.941202e-08 max(|| b_i - A x_i ||_1) 2.928208e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.679558e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.941202e-08 max(|| b_i - A x_i ||_1) 2.928208e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.679558e-01 (SUCCESS) Start 2056: mpi_dst_example_simple_lap_s_facto1_sched0_not_pqrcpbegin Test #2057: mpi_dst_example_simple_lap_s_facto1_sched0_not_pqrcpend .................***Timeout 385.17 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package 
+ +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.741841e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.766975e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.870178e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.224017e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.095347e+01 s 
+-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.626836e-01 s Time to initialize coeftab 5.568973e-02 s Time to factorize 7.919110e-01 s ( 6.61 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 5.445224e-02 s Time for refinement 1.163342e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.985048e-07 max(|| b_i - A x_i ||_1) 8.595462e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.080097e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.985048e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.985048e-07 max(|| b_i - A x_i ||_1) 8.595462e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.080097e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.985048e-07 max(|| b_i - A x_i ||_1) 8.595462e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.080097e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.595462e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.080097e+00 (SUCCESS) Start 2057: mpi_dst_example_simple_lap_s_facto1_sched0_not_pqrcpend Test #2058: mpi_dst_example_simple_lap_s_facto1_sched0_kway_pqrcpbegin ..............***Timeout 385.14 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.919067e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.443444e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.327367e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.806825e+00 
s +-------------------------------------------------+ Analyze task: Total time for analyze 3.323277e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 9.939724e+00 s Time to initialize coeftab 5.480023e-01 s Time to factorize 5.199009e+00 s ( 1.01 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 4.055957e-01 s - iteration 1 : total iteration time 0.603 s error 5.5168e-11 Time for refinement 1.479311e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.944054e-08 max(|| b_i - A x_i ||_1) 2.931508e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.683705e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.944054e-08 max(|| b_i - A x_i ||_1) 2.931508e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.683705e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.944054e-08 max(|| b_i - A x_i ||_1) 2.931508e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.683705e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.944054e-08 max(|| b_i - A x_i ||_1) 2.931508e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.683705e-01 (SUCCESS) Start 2058: mpi_dst_example_simple_lap_s_facto1_sched0_kway_pqrcpbegin Test #2060: mpi_dst_example_simple_lap_s_facto1_sched0_kwayprojections_pqrcpbegin 
...***Timeout 385.01 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.817268e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.250922e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.652402e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 
17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.530515e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.251459e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 6.104018e-01 s Time to initialize coeftab 4.846664e-01 s Time to factorize 2.814047e+00 s ( 1.86 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.699579e-01 s - iteration 1 : total iteration time 0.399 s error 5.5073e-11 Time for refinement 9.268488e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.944046e-08 max(|| b_i - A x_i ||_1) 2.931508e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.683705e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.944046e-08 max(|| b_i - A x_i ||_1) 2.931508e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.683705e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.944046e-08 max(|| b_i - A x_i ||_1) 2.931508e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.683705e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.944046e-08 max(|| b_i - A x_i ||_1) 2.931508e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 
3.683705e-01 (SUCCESS) Start 2060: mpi_dst_example_simple_lap_s_facto1_sched0_kwayprojections_pqrcpbegin Test #1669: shm_example_simple_lap_z_facto1_sched4_kwayprojections_rqrcpbegin .......***Timeout 384.97 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.380995e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.749661e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.362782e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 
MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 4.090298e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.811581e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.433999e-02 s Time to initialize coeftab 2.103361e+00 s Time to factorize 4.576354e+00 s ( 4.66 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 497 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1.25 Mo / 1.25 Mo Time to solve 2.591030e+00 s - iteration 1 : total iteration time 1.4 s error 1.8646e-14 Time for refinement 3.552135e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.864449e-14 max(|| b_i - A x_i ||_1) 2.228044e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.622112e-02 (SUCCESS) Test #2063: mpi_dst_example_simple_lap_s_facto1_sched0_not_rqrcpend .................***Timeout 384.95 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low 
rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.285026e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.564569e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.292464e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.327982e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.913943e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 5.415288e-01 s Time to initialize coeftab 2.945733e-01 s Time to factorize 2.729665e+00 s ( 1.92 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not 
selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 3.449946e-01 s Time for refinement 5.390028e-01 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.978469e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.978469e-07 max(|| b_i - A x_i ||_1) 8.532148e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.072141e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.532148e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.072141e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.978469e-07 max(|| b_i - A x_i ||_1) 8.532148e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.072141e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.978469e-07 max(|| b_i - A x_i ||_1) 8.532148e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.072141e+00 (SUCCESS) Start 2063: mpi_dst_example_simple_lap_s_facto1_sched0_not_rqrcpend Test #1670: shm_example_simple_lap_z_facto1_sched4_kwayprojections_rqrcpend .........***Timeout 384.94 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: 
Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.751141e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.322467e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.394902e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.005488e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.823401e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.989380e-01 s Time to initialize coeftab 1.169273e-01 s Time to factorize 7.188070e+00 s ( 2.96 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 497 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1.25 Mo / 
1.25 Mo Time to solve 3.458924e+00 s Time for refinement 1.401660e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.885522e-16 max(|| b_i - A x_i ||_1) 1.879544e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.742728e-03 (SUCCESS) Test #2064: mpi_dst_example_simple_lap_s_facto1_sched0_kway_rqrcpbegin ..............***Timeout 384.85 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.094026e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization 
using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.609901e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.629412e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.881858e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.203849e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.139035e-01 s Time to initialize coeftab 4.570807e-01 s Time to factorize 8.373308e-01 s ( 6.25 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.2 Ko / 44.3 Ko ------------------------------------------------ Total 68.4 Ko / 68.5 Ko Time to solve 2.246828e-02 s - iteration 1 : total iteration time 0.00986 s error 5.5121e-11 Time for refinement 2.865204e-02 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.782463e-08 max(|| b_i - A x_i ||_1) 2.846896e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.577382e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.782463e-08 max(|| b_i - A x_i ||_1) 2.846896e-08 
max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.577382e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.782463e-08 max(|| b_i - A x_i ||_1) 2.846896e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.577382e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.782463e-08 max(|| b_i - A x_i ||_1) 2.846896e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.577382e-01 (SUCCESS) Start 2064: mpi_dst_example_simple_lap_s_facto1_sched0_kway_rqrcpbegin Test #2065: mpi_dst_example_simple_lap_s_facto1_sched0_kway_rqrcpend ................***Timeout 384.84 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method 
is: Scotch Time to compute ordering 2.675801e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.063629e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.489401e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.089997e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.122202e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.287676e-01 s Time to initialize coeftab 4.997964e-01 s Time to factorize 4.580883e+00 s ( 1.14 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 2.459498e-01 s Time for refinement 3.037778e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.972569e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.972569e-07 max(|| b_i - A x_i ||_1) 8.513419e-08 max(|| b_i - A x_i ||_1 / 
(||A||_1 * ||x_i||_oo * eps)) 1.069788e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.513419e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.069788e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.972569e-07 max(|| b_i - A x_i ||_1) 8.513419e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.069788e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.972569e-07 max(|| b_i - A x_i ||_1) 8.513419e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.069788e+00 (SUCCESS) Start 2065: mpi_dst_example_simple_lap_s_facto1_sched0_kway_rqrcpend Test #2066: mpi_dst_example_simple_lap_s_facto1_sched0_kwayprojections_rqrcpbegin ...***Timeout 384.82 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 
200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.141928e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.742428e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.034550e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.088489e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.395603e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 7.510551e-01 s Time to initialize coeftab 2.273060e-01 s Time to factorize 5.410428e+00 s (990.51 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.2 Ko / 44.3 Ko ------------------------------------------------ Total 68.4 Ko / 68.5 Ko Time to solve 1.048308e-01 s - iteration 1 : total iteration time 0.249 s error 5.5128e-11 Time for refinement 6.125524e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 
max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.782640e-08 max(|| b_i - A x_i ||_1) 2.847811e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.578532e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.782640e-08 max(|| b_i - A x_i ||_1) 2.847811e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.578532e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.782640e-08 max(|| b_i - A x_i ||_1) 2.847811e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.578532e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.782640e-08 max(|| b_i - A x_i ||_1) 2.847811e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.578532e-01 (SUCCESS) Start 2066: mpi_dst_example_simple_lap_s_facto1_sched0_kwayprojections_rqrcpbegin Test #2068: mpi_dst_example_simple_lap_s_facto1_sched0_not_tqrcpbegin ...............***Timeout 384.82 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 
Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.906415e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.739636e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.317690e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.019513e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.024336e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 9.194473e-03 s Time to initialize coeftab 5.875087e-01 s Time to factorize 6.486000e-01 s ( 8.07 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.2 Ko / 44.3 Ko ------------------------------------------------ Total 68.4 Ko / 68.5 Ko Time to solve 5.268828e-03 s - iteration 1 : total iteration time 0.00248 s error 5.5143e-11 Time for refinement 8.685383e-03 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i 
||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.860206e-08 max(|| b_i - A x_i ||_1) 2.884300e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.624383e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.860206e-08 max(|| b_i - A x_i ||_1) 2.884300e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.624383e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.860206e-08 max(|| b_i - A x_i ||_1) 2.884300e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.624383e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.860206e-08 max(|| b_i - A x_i ||_1) 2.884300e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.624383e-01 (SUCCESS) Start 2068: mpi_dst_example_simple_lap_s_facto1_sched0_not_tqrcpbegin Test #2069: mpi_dst_example_simple_lap_s_facto1_sched0_not_tqrcpend .................***Timeout 384.83 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections 
depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.655153e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.103736e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.984243e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.830884e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.941184e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.886868e+00 s Time to initialize coeftab 5.231186e+00 s Time to factorize 7.791222e+00 s (687.84 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.248588e+00 s Time for refinement 1.986048e+00 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 
2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.965841e-07 max(|| b_i - A x_i ||_1) 8.485211e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.066243e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.965841e-07 max(|| b_i - A x_i ||_1) 8.485211e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.066243e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.965841e-07 max(|| b_i - A x_i ||_1) 8.485211e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.066243e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.965841e-07 max(|| b_i - A x_i ||_1) 8.485211e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.066243e+00 (SUCCESS) Start 2069: mpi_dst_example_simple_lap_s_facto1_sched0_not_tqrcpend Test #1676: shm_example_simple_lap_z_facto1_sched4_kwayprojections_tqrcpend .........***Timeout 384.69 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting 
Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.722585e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.969146e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.303174e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 9.591255e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.880450e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 5.972971e-01 s Time to initialize coeftab 1.625169e-01 s Time to factorize 2.362957e+00 s ( 9.02 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 497 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1.25 Mo / 1.25 Mo Time to solve 5.726687e+00 s Time for refinement 1.346216e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.816038e-16 max(|| b_i - A x_i ||_1) 1.859162e-16 max(|| b_i - A x_i ||_1 
/ (||A||_1 * ||x_i||_oo * eps)) 4.691296e-03 (SUCCESS) Test #2071: mpi_dst_example_simple_lap_s_facto1_sched0_kway_tqrcpend ................***Timeout 384.67 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.031371e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.295467e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.677841e-01 s 
+-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.168679e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.409241e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.870928e+00 s Time to initialize coeftab 3.020296e-01 s Time to factorize 1.069720e+01 s (500.98 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 7.159911e-01 s Time for refinement 1.029009e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.971713e-07 max(|| b_i - A x_i ||_1) 8.508645e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.069188e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.971713e-07 max(|| b_i - A x_i ||_1) 8.508645e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.069188e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.971713e-07 max(|| b_i - A x_i ||_1) 8.508645e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.069188e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.971713e-07 max(|| b_i - A x_i ||_1) 
8.508645e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.069188e+00 (SUCCESS) Start 2071: mpi_dst_example_simple_lap_s_facto1_sched0_kway_tqrcpend Test #1681: shm_example_simple_lap_z_facto1_sched4_kwayprojections_rqrrtbegin .......***Timeout 384.66 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.209721e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.968285e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.480794e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 
17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 4.971100e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.281845e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.643878e-02 s Time to initialize coeftab 2.153829e+00 s Time to factorize 4.365777e+00 s ( 4.88 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 497 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1.25 Mo / 1.25 Mo Time to solve 3.349639e+00 s - iteration 1 : total iteration time 10.6 s error 2.8701e-13 Time for refinement 1.790987e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.870053e-13 max(|| b_i - A x_i ||_1) 3.784073e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.548499e-01 (SUCCESS) Test #1684: shm_example_simple_lap_z_facto1_sched4_kway_pqrcpilu1 ...................***Timeout 384.64 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank 
parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.940758e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.109600e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.012495e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.166466e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.371942e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.435669e-01 s Time to initialize coeftab 2.332436e-01 s Time to factorize 2.762924e+00 s ( 7.71 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 497 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1.25 Mo / 1.25 Mo Time to solve 3.983191e+00 s - iteration 1 : 
total iteration time 8.49 s error 1.0742e-15 Time for refinement 1.306535e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.086563e-15 max(|| b_i - A x_i ||_1) 1.243492e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.137753e-03 (SUCCESS) Test #2072: mpi_dst_example_simple_lap_s_facto1_sched0_kwayprojections_tqrcpbegin ...***Timeout 384.63 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.582834e+01 s +-------------------------------------------------+ Symbolic factorization subtask: 
Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.534368e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.318450e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.977914e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.823866e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.231160e-01 s Time to initialize coeftab 6.354045e-01 s Time to factorize 1.163850e+01 s (460.46 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.2 Ko / 44.3 Ko ------------------------------------------------ Total 68.4 Ko / 68.5 Ko Time to solve 1.357992e-02 s - iteration 1 : total iteration time 0.0273 s error 5.5059e-11 Time for refinement 9.180602e-02 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.863290e-08 max(|| b_i - A x_i ||_1) 2.888369e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.629497e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.863290e-08 max(|| b_i - A x_i 
||_1) 2.888369e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.629497e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.863290e-08 max(|| b_i - A x_i ||_1) 2.888369e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.629497e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.863290e-08 max(|| b_i - A x_i ||_1) 2.888369e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.629497e-01 (SUCCESS) Start 2072: mpi_dst_example_simple_lap_s_facto1_sched0_kwayprojections_tqrcpbegin Test #2075: mpi_dst_example_simple_lap_s_facto1_sched0_not_rqrrtend .................***Timeout 384.62 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ 
Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.742710e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.503236e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.502437e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 4.801107e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.333440e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.122825e+00 s Time to initialize coeftab 2.730367e-01 s Time to factorize 4.031567e+00 s ( 1.30 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.245566e+00 s Time for refinement 1.243735e+00 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.915137e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.915137e-07 max(|| b_i - A x_i ||_1) 
8.567685e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.076607e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.915137e-07 max(|| b_i - A x_i ||_1) 8.567685e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.076607e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.567685e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.076607e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.915137e-07 max(|| b_i - A x_i ||_1) 8.567685e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.076607e+00 (SUCCESS) Start 2075: mpi_dst_example_simple_lap_s_facto1_sched0_not_rqrrtend Test #2082: mpi_dst_example_simple_lap_s_facto2_sched0_not_svdbegin .................***Timeout 384.62 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 
300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.345220e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.540493e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.162867e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 6.276489e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.437503e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.218222e-01 s Time to initialize coeftab 7.110030e-01 s Time to factorize 2.668039e+00 s ( 3.74 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 2.132932e-02 s Time for refinement 2.678466e-02 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 
1.909060e-07 max(|| b_i - A x_i ||_1) 8.321151e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.045628e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.909060e-07 max(|| b_i - A x_i ||_1) 8.321151e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.045628e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.909060e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.909060e-07 max(|| b_i - A x_i ||_1) 8.321151e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.045628e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.321151e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.045628e+00 (SUCCESS) Start 2082: mpi_dst_example_simple_lap_s_facto2_sched0_not_svdbegin Test #2087: mpi_dst_example_simple_lap_s_facto2_sched0_kwayprojections_svdend .......***Timeout 384.61 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 
Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.912981e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.579640e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.377813e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 6.679924e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.008201e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.006274e-01 s Time to initialize coeftab 6.573208e-01 s Time to factorize 3.271149e+00 s ( 3.05 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 6.935960e-02 s Time for refinement 9.059984e-02 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 
max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.702367e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.702367e-07 max(|| b_i - A x_i ||_1) 7.356731e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.244397e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.702367e-07 max(|| b_i - A x_i ||_1) 7.356731e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.244397e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.702367e-07 max(|| b_i - A x_i ||_1) 7.356731e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.244397e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 7.356731e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.244397e-01 (SUCCESS) Start 2087: mpi_dst_example_simple_lap_s_facto2_sched0_kwayprojections_svdend Test #2088: mpi_dst_example_simple_lap_s_facto2_sched0_not_pqrcpbegin ...............***Timeout 384.60 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections 
distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.421933e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.445909e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.220437e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.334668e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.046585e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 7.511127e-01 s Time to initialize coeftab 4.320232e-01 s Time to factorize 5.941346e+00 s ( 1.68 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 2.497837e-01 s - iteration 1 : total iteration time 0.879 s error 9.1047e-11 Time for refinement 2.250659e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i 
||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.209517e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.209517e-08 max(|| b_i - A x_i ||_1) 2.990139e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.757380e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.990139e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.757380e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.209517e-08 max(|| b_i - A x_i ||_1) 2.990139e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.757380e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.209517e-08 max(|| b_i - A x_i ||_1) 2.990139e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.757380e-01 (SUCCESS) Start 2088: mpi_dst_example_simple_lap_s_facto2_sched0_not_pqrcpbegin Test #2089: mpi_dst_example_simple_lap_s_facto2_sched0_not_pqrcpend .................***Timeout 384.59 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 
2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.911582e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.475053e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.290116e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 3.136700e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.515710e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 4.912587e-01 s Time to initialize coeftab 5.310224e-01 s Time to factorize 2.225890e+00 s ( 4.49 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 4.500289e-01 s - iteration 1 : total iteration time 0.736 s error 1.5144e-12 Time 
for refinement 1.968016e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.641699e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.641699e-08 max(|| b_i - A x_i ||_1) 2.715983e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.412878e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.715983e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.412878e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.641699e-08 max(|| b_i - A x_i ||_1) 2.715983e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.412878e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.641699e-08 max(|| b_i - A x_i ||_1) 2.715983e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.412878e-01 (SUCCESS) Start 2089: mpi_dst_example_simple_lap_s_facto2_sched0_not_pqrcpend Test #1707: shm_example_simple_lap_z_facto2_sched4_kwayprojections_tqrcpbegin .......***Timeout 384.58 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min 
ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.959563e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.290280e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.593275e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 8.008440e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.069866e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.290684e-01 s Time to initialize coeftab 1.493473e+00 s Time to factorize 1.237964e+01 s ( 3.23 MFlop/s) Number of operations 8.15 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 14.1 Ko Outside 16.9 Ko Low-rank supernodes Diag in diag 497 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 1.49 Mo / 1.49 Mo ------------------------------------------------ Total 2.01 Mo / 2.01 Mo Time to solve 2.060300e+00 s - iteration 1 : total iteration time 3.73 s error 1.539e-14 Time for refinement 6.871601e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 
2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.539165e-14 max(|| b_i - A x_i ||_1) 1.687229e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.257452e-02 (SUCCESS) Test #1710: shm_example_simple_lap_z_facto2_sched4_not_rqrrtend .....................***Timeout 384.57 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.496096e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.220317e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.133570e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of 
non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 7.694214e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.657121e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.539506e-01 s Time to initialize coeftab 5.284643e-02 s Time to factorize 3.603973e+00 s (11.09 MFlop/s) Number of operations 8.15 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 14.1 Ko Outside 16.9 Ko Low-rank supernodes Diag in diag 497 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 1.49 Mo / 1.49 Mo ------------------------------------------------ Total 2.01 Mo / 2.01 Mo Time to solve 6.796131e-01 s Time for refinement 5.408405e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.685963e-16 max(|| b_i - A x_i ||_1) 1.772093e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.471591e-03 (SUCCESS) Test #2091: mpi_dst_example_simple_lap_s_facto2_sched0_kway_pqrcpend ................***Timeout 384.42 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 
320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.790986e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.687400e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.169197e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.480660e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.355857e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 5.127857e+00 s Time to initialize coeftab 3.260609e-01 s Time to factorize 5.390827e+00 s ( 1.85 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 
24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 7.793895e-01 s - iteration 1 : total iteration time 1.3 s error 1.4182e-12 Time for refinement 2.709731e+00 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.638982e-08 max(|| b_i - A x_i ||_1) 2.711602e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.407373e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.638982e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.638982e-08 max(|| b_i - A x_i ||_1) 2.711602e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.407373e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.711602e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.407373e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.638982e-08 max(|| b_i - A x_i ||_1) 2.711602e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.407373e-01 (SUCCESS) Start 2091: mpi_dst_example_simple_lap_s_facto2_sched0_kway_pqrcpend Test #2093: mpi_dst_example_simple_lap_s_facto2_sched0_kwayprojections_pqrcpend .....***Timeout 384.41 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size 
(min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.031369e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.793685e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.637430e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 3.082149e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.420002e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 5.865606e+00 s Time to initialize coeftab 1.283361e+00 s Time to factorize 5.753121e+00 s ( 1.74 MFlop/s) Number of operations 8.17 MFlops 
Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 4.064033e-01 s - iteration 1 : total iteration time 0.625 s error 1.5144e-12 Time for refinement 1.893614e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.641699e-08 max(|| b_i - A x_i ||_1) 2.715983e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.412878e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.641699e-08 max(|| b_i - A x_i ||_1) 2.715983e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.412878e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.641699e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.641699e-08 max(|| b_i - A x_i ||_1) 2.715983e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.412878e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.715983e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.412878e-01 (SUCCESS) Start 2093: mpi_dst_example_simple_lap_s_facto2_sched0_kwayprojections_pqrcpend Test #2095: mpi_dst_example_simple_lap_s_facto2_sched0_not_rqrcpend .................***Timeout 384.28 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX 
package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.568095e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.805054e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.397986e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.713019e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.026880e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to 
initialize internal csc 3.558147e+00 s Time to initialize coeftab 2.026395e+00 s Time to factorize 1.735946e+01 s (588.97 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 6.839875e-01 s - iteration 1 : total iteration time 1.66 s error 1.5135e-12 Time for refinement 2.608001e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.647298e-08 max(|| b_i - A x_i ||_1) 2.722673e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.421284e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.647298e-08 max(|| b_i - A x_i ||_1) 2.722673e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.421284e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.647298e-08 max(|| b_i - A x_i ||_1) 2.722673e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.421284e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.647298e-08 max(|| b_i - A x_i ||_1) 2.722673e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.421284e-01 (SUCCESS) Start 2095: mpi_dst_example_simple_lap_s_facto2_sched0_not_rqrcpend Test #1714: shm_example_simple_lap_z_facto2_sched4_kwayprojections_rqrrtend .........***Timeout 384.27 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.192160e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.121219e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.925860e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 1.514253e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.301330e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 4.368907e-02 s Time to initialize 
coeftab 6.009376e-02 s Time to factorize 1.373151e+00 s (29.11 MFlop/s) Number of operations 8.15 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 14.1 Ko Outside 16.9 Ko Low-rank supernodes Diag in diag 497 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 1.49 Mo / 1.49 Mo ------------------------------------------------ Total 2.01 Mo / 2.01 Mo Time to solve 2.348379e+00 s Time for refinement 1.609013e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.671728e-16 max(|| b_i - A x_i ||_1) 1.776422e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.482516e-03 (SUCCESS) 1932/3626 Test #2135: mpi_dst_example_simple_lap_d_facto0_sched0_kway_tqrcpend ................***Timeout 384.26 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 
ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.077118e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.981245e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.279071e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.340951e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.370008e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 7.640077e+00 s Time to initialize coeftab 1.970453e+00 s Time to factorize 1.322198e+01 s (392.06 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 7.099845e-01 s - iteration 1 : total iteration time 0.912 s error 2.4483e-16 Time for refinement 1.993152e+00 s || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 
max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.724138e-16 max(|| b_i - A x_i ||_1) 7.143169e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.976005e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.724138e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.724138e-16 max(|| b_i - A x_i ||_1) 7.143169e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.976005e-04 (SUCCESS) max(|| b_i - A x_i ||_1) 7.143169e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.976005e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.724138e-16 max(|| b_i - A x_i ||_1) 7.143169e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.976005e-04 (SUCCESS) Start 2135: mpi_dst_example_simple_lap_d_facto0_sched0_kway_tqrcpend 1932/3626 Test #2136: mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_tqrcpbegin ...***Timeout 384.26 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections 
distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.272894e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.392071e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.947473e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 5.825343e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.926477e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.902386e+00 s Time to initialize coeftab 1.620750e+00 s Time to factorize 2.013082e+01 s (257.51 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.4 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 2.353924e-01 s - iteration 1 : total iteration time 0.281 s error 3.6959e-14 Time for refinement 
7.779813e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.695328e-14 max(|| b_i - A x_i ||_1) 6.876824e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.641320e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.695328e-14 max(|| b_i - A x_i ||_1) 6.876824e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.641320e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.695328e-14 max(|| b_i - A x_i ||_1) 6.876824e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.641320e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.695328e-14 max(|| b_i - A x_i ||_1) 6.876824e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.641320e-02 (SUCCESS) Start 2136: mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_tqrcpbegin 1932/3626 Test #2137: mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_tqrcpend .....***Timeout 384.26 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank 
parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.936534e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.436674e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.844714e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.897118e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.268103e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.500590e-02 s Time to initialize coeftab 5.604674e-01 s Time to factorize 3.394974e-01 s (14.91 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 
88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 8.067389e-03 s - iteration 1 : total iteration time 0.00613 s error 2.4483e-16 Time for refinement 2.282783e-02 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.724130e-16 max(|| b_i - A x_i ||_1) 7.145535e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.978978e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.724130e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.724130e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.724130e-16 max(|| b_i - A x_i ||_1) 7.145535e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.978978e-04 (SUCCESS) max(|| b_i - A x_i ||_1) 7.145535e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.978978e-04 (SUCCESS) max(|| b_i - A x_i ||_1) 7.145535e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.978978e-04 (SUCCESS) Start 2137: mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_tqrcpend 1932/3626 Test #2138: mpi_dst_example_simple_lap_d_facto0_sched0_not_rqrrtbegin ...............***Timeout 384.25 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size 
(min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.069356e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.225238e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.802837e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 4.326548e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.559686e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.694396e-01 s Time to initialize coeftab 3.501979e-01 s Time to factorize 1.519652e+01 s (341.12 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: 
------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.383370e-02 s - iteration 1 : total iteration time 0.00836 s error 6.4174e-13 Time for refinement 2.966284e-02 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.417494e-13 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.417494e-13 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.417494e-13 max(|| b_i - A x_i ||_1) 1.272694e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.599249e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.417494e-13 max(|| b_i - A x_i ||_1) 1.272694e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.599249e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.272694e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.599249e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.272694e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.599249e+00 (SUCCESS) Start 2138: mpi_dst_example_simple_lap_d_facto0_sched0_not_rqrrtbegin 1932/3626 Test #2139: mpi_dst_example_simple_lap_d_facto0_sched0_not_rqrrtend .................***Timeout 384.25 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: 
Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.719868e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.119046e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.034419e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 4.238002e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.373509e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.139938e+00 s 
Time to initialize coeftab 2.150010e-01 s Time to factorize 8.610249e+00 s (602.05 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 7.478887e-01 s - iteration 1 : total iteration time 1.14 s error 2.8928e-15 Time for refinement 2.113582e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.900447e-15 max(|| b_i - A x_i ||_1) 3.016773e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.790834e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.900447e-15 max(|| b_i - A x_i ||_1) 3.016773e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.790834e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.900447e-15 max(|| b_i - A x_i ||_1) 3.016773e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.790834e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.900447e-15 max(|| b_i - A x_i ||_1) 3.016773e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.790834e-03 (SUCCESS) Start 2139: mpi_dst_example_simple_lap_d_facto0_sched0_not_rqrrtend 1932/3626 Test #2140: mpi_dst_example_simple_lap_d_facto0_sched0_kway_rqrrtbegin ..............***Timeout 384.25 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number 
has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.214463e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.022696e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.773698e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.552628e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 
3.767949e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.712902e+00 s Time to initialize coeftab 6.305205e-01 s Time to factorize 7.104323e+00 s (729.67 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 7.427614e-01 s - iteration 1 : total iteration time 1.42 s error 6.4174e-13 Time for refinement 2.874065e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.417492e-13 max(|| b_i - A x_i ||_1) 1.272690e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.599245e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.417492e-13 max(|| b_i - A x_i ||_1) 1.272690e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.599245e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.417492e-13 max(|| b_i - A x_i ||_1) 1.272690e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.599245e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.417492e-13 max(|| b_i - A x_i ||_1) 1.272690e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.599245e+00 (SUCCESS) Start 2140: mpi_dst_example_simple_lap_d_facto0_sched0_kway_rqrrtbegin 1932/3626 Test #2141: mpi_dst_example_simple_lap_d_facto0_sched0_kway_rqrrtend ................***Timeout 384.24 sec ischedInit: The thread number has been automatically set to 
256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.081029e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.045701e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.524261e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to 
factorize 5.477282e-04 s Time for mapping/scheduling 3.259261e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.609705e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.437796e-01 s Time to initialize coeftab 2.419591e-01 s Time to factorize 2.796089e+00 s ( 1.81 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.920061e-01 s - iteration 1 : total iteration time 0.493 s error 2.8928e-15 Time for refinement 9.861333e-01 s || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.900447e-15 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.900447e-15 max(|| b_i - A x_i ||_1) 3.016773e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.790834e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.900447e-15 max(|| b_i - A x_i ||_1) 3.016773e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.790834e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.900447e-15 max(|| b_i - A x_i ||_1) 3.016773e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.790834e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 3.016773e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.790834e-03 (SUCCESS) Start 2141: mpi_dst_example_simple_lap_d_facto0_sched0_kway_rqrrtend 1932/3626 Test #2142: 
mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_rqrrtbegin ...***Timeout 384.24 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.916824e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.730715e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.136355e-02 s +-------------------------------------------------+ 
Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.812065e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.374780e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.517189e-01 s Time to initialize coeftab 4.571166e-01 s Time to factorize 1.089873e+01 s (475.63 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 4.036179e-01 s - iteration 1 : total iteration time 0.804 s error 6.4174e-13 Time for refinement 1.409290e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.417494e-13 max(|| b_i - A x_i ||_1) 1.272694e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.599249e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.417494e-13 max(|| b_i - A x_i ||_1) 1.272694e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.599249e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.417494e-13 max(|| b_i - A x_i ||_1) 1.272694e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.599249e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.417494e-13 max(|| b_i - A x_i ||_1) 
1.272694e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.599249e+00 (SUCCESS) Start 2142: mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_rqrrtbegin 1932/3626 Test #2143: mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_rqrrtend .....***Timeout 384.24 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.917377e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.377237e-02 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.000875e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 4.432909e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.467787e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.643512e-01 s Time to initialize coeftab 3.335030e-01 s Time to factorize 1.691369e+00 s ( 2.99 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 4.037450e-01 s - iteration 1 : total iteration time 1.14 s error 2.8928e-15 Time for refinement 2.161215e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.900447e-15 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.900447e-15 max(|| b_i - A x_i ||_1) 3.016773e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.790834e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 3.016773e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.790834e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.900447e-15 max(|| 
b_i - A x_i ||_1) 3.016773e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.790834e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.900447e-15 max(|| b_i - A x_i ||_1) 3.016773e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.790834e-03 (SUCCESS) Start 2143: mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_rqrrtend 1932/3626 Test #2144: mpi_dst_example_simple_lap_d_facto0_sched0_kway_pqrcpilu0 ...............***Timeout 384.24 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.091393e+01 s +-------------------------------------------------+ Symbolic 
factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.987612e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.891485e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.737906e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.108883e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.143366e+00 s Time to initialize coeftab 3.019481e-01 s Time to factorize 5.623116e+00 s (921.87 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 2.348526e-01 s - iteration 1 : total iteration time 0.333 s error 7.8828e-15 Time for refinement 7.372328e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.888032e-15 max(|| b_i - A x_i ||_1) 1.268629e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.594142e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.888032e-15 
max(|| b_i - A x_i ||_1) 1.268629e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.594142e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.888032e-15 max(|| b_i - A x_i ||_1) 1.268629e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.594142e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.888032e-15 max(|| b_i - A x_i ||_1) 1.268629e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.594142e-02 (SUCCESS) Start 2144: mpi_dst_example_simple_lap_d_facto0_sched0_kway_pqrcpilu0 1932/3626 Test #2145: mpi_dst_example_simple_lap_d_facto0_sched0_kway_pqrcpilu1 ...............***Timeout 384.24 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 
+-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.237120e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.599725e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.041137e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.361385e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.681298e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.082484e-01 s Time to initialize coeftab 5.518412e-01 s Time to factorize 2.338991e+00 s ( 2.16 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 6.029983e-01 s - iteration 1 : total iteration time 0.789 s error 7.8828e-15 Time for refinement 1.674326e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / 
|| b_i ||_2) 7.888445e-15 max(|| b_i - A x_i ||_1) 1.268869e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.594443e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.888445e-15 max(|| b_i - A x_i ||_1) 1.268869e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.594443e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.888445e-15 max(|| b_i - A x_i ||_1) 1.268869e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.594443e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.888445e-15 max(|| b_i - A x_i ||_1) 1.268869e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.594443e-02 (SUCCESS) Start 2145: mpi_dst_example_simple_lap_d_facto0_sched0_kway_pqrcpilu1 1932/3626 Test #2146: mpi_dst_example_simple_lap_d_facto1_sched0_not_svdbegin .................***Timeout 384.23 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has 
been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.867070e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.140813e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.237286e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 4.771652e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.583485e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.207549e+00 s Time to initialize coeftab 7.872404e-01 s Time to factorize 3.068462e+01 s (174.65 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 6.012239e-01 s - iteration 1 : total iteration time 1.38 s error 1.969e-14 Time for refinement 2.682294e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| 
b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.969043e-14 max(|| b_i - A x_i ||_1) 3.896933e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.896831e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.969043e-14 max(|| b_i - A x_i ||_1) 3.896933e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.896831e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.969043e-14 max(|| b_i - A x_i ||_1) 3.896933e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.896831e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.969043e-14 max(|| b_i - A x_i ||_1) 3.896933e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.896831e-02 (SUCCESS) Start 2146: mpi_dst_example_simple_lap_d_facto1_sched0_not_svdbegin 1932/3626 Test #2147: mpi_dst_example_simple_lap_d_facto1_sched0_not_svdend ...................***Timeout 384.22 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 
Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.105770e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.314287e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.971309e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.455286e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.572305e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.021843e+00 s Time to initialize coeftab 5.143210e-01 s Time to factorize 9.491560e+00 s (564.62 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 3.120146e-01 s - iteration 1 : total iteration time 0.495 s error 2.4e-16 Time for refinement 9.653191e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 
2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.632877e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.632877e-16 max(|| b_i - A x_i ||_1) 7.037298e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.842970e-04 (SUCCESS) max(|| b_i - A x_i ||_1) 7.037298e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.842970e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.632877e-16 max(|| b_i - A x_i ||_1) 7.037298e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.842970e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.632877e-16 max(|| b_i - A x_i ||_1) 7.037298e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.842970e-04 (SUCCESS) Start 2147: mpi_dst_example_simple_lap_d_facto1_sched0_not_svdend 1932/3626 Test #2148: mpi_dst_example_simple_lap_d_facto1_sched0_kway_svdbegin ................***Timeout 384.21 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute 
Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.250696e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.662433e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.952625e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.531062e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.640821e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 6.300261e-01 s Time to initialize coeftab 5.663184e-01 s Time to factorize 6.386223e+00 s (839.16 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 
Ko Time to solve 2.780510e-01 s - iteration 1 : total iteration time 0.354 s error 1.969e-14 Time for refinement 5.603605e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.968912e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.968912e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.968912e-14 max(|| b_i - A x_i ||_1) 3.896748e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.896598e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.968912e-14 max(|| b_i - A x_i ||_1) 3.896748e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.896598e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 3.896748e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.896598e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 3.896748e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.896598e-02 (SUCCESS) Start 2148: mpi_dst_example_simple_lap_d_facto1_sched0_kway_svdbegin 1932/3626 Test #2149: mpi_dst_example_simple_lap_d_facto1_sched0_kway_svdend ..................***Timeout 384.19 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time 
Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.571964e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.314191e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.638381e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.087971e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.516082e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.937049e+00 s Time to initialize coeftab 1.468236e+00 s Time to factorize 1.105611e+01 s (484.72 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 
48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 6.020517e-01 s - iteration 1 : total iteration time 0.972 s error 2.4e-16 Time for refinement 2.260339e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.632877e-16 max(|| b_i - A x_i ||_1) 7.037298e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.842970e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.632877e-16 max(|| b_i - A x_i ||_1) 7.037298e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.842970e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.632877e-16 max(|| b_i - A x_i ||_1) 7.037298e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.842970e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.632877e-16 max(|| b_i - A x_i ||_1) 7.037298e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.842970e-04 (SUCCESS) Start 2149: mpi_dst_example_simple_lap_d_facto1_sched0_kway_svdend 1932/3626 Test #2150: mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_svdbegin .....***Timeout 384.17 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads 
per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.537774e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.866833e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.335980e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.261281e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.081616e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.737716e-01 s Time to initialize coeftab 5.815388e-01 s Time to factorize 7.200089e+00 s (744.31 KFlop/s) Number of operations 26.58 
MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 2.327419e-01 s - iteration 1 : total iteration time 0.316 s error 1.969e-14 Time for refinement 7.143388e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.969043e-14 max(|| b_i - A x_i ||_1) 3.896933e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.896831e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.969043e-14 max(|| b_i - A x_i ||_1) 3.896933e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.896831e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.969043e-14 max(|| b_i - A x_i ||_1) 3.896933e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.896831e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.969043e-14 max(|| b_i - A x_i ||_1) 3.896933e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.896831e-02 (SUCCESS) Start 2150: mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_svdbegin 1932/3626 Test #2151: mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_svdend .......***Timeout 384.16 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.170061e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.594694e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.000441e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.916129e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.612514e+01 s 
+-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.041270e-01 s Time to initialize coeftab 5.530948e-01 s Time to factorize 1.893768e+00 s ( 2.76 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 9.040247e-02 s - iteration 1 : total iteration time 0.0521 s error 2.4e-16 Time for refinement 1.552498e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.636169e-16 max(|| b_i - A x_i ||_1) 7.068772e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.882520e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.636169e-16 max(|| b_i - A x_i ||_1) 7.068772e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.882520e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.636169e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.636169e-16 max(|| b_i - A x_i ||_1) 7.068772e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.882520e-04 (SUCCESS) max(|| b_i - A x_i ||_1) 7.068772e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.882520e-04 (SUCCESS) Start 2151: mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_svdend 1932/3626 Test #2152: mpi_dst_example_simple_lap_d_facto1_sched0_not_pqrcpbegin ...............***Timeout 384.14 sec ischedInit: The thread number has been automatically set to 256 
ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.604165e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.400048e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.200555e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time 
to factorize 6.932254e-04 s Time for mapping/scheduling 2.699767e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.913631e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.052280e-01 s Time to initialize coeftab 7.126953e-01 s Time to factorize 2.392263e+01 s (224.02 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 3.110093e-01 s - iteration 1 : total iteration time 0.212 s error 1.52e-14 Time for refinement 4.198420e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.519674e-14 max(|| b_i - A x_i ||_1) 2.802993e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.522201e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.519674e-14 max(|| b_i - A x_i ||_1) 2.802993e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.522201e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.519674e-14 max(|| b_i - A x_i ||_1) 2.802993e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.522201e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.519674e-14 max(|| b_i - A x_i ||_1) 2.802993e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.522201e-02 (SUCCESS) Start 2152: mpi_dst_example_simple_lap_d_facto1_sched0_not_pqrcpbegin 1932/3626 Test 
#2153: mpi_dst_example_simple_lap_d_facto1_sched0_not_pqrcpend .................***Timeout 384.13 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.918949e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.708075e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.684655e-03 s +-------------------------------------------------+ 
Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 4.437022e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.609554e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.160489e+00 s Time to initialize coeftab 1.987539e-01 s Time to factorize 1.499884e+00 s ( 3.49 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 3.266417e-01 s - iteration 1 : total iteration time 0.626 s error 2.4333e-16 Time for refinement 1.082048e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.689246e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.689246e-16 max(|| b_i - A x_i ||_1) 6.906050e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.678046e-04 (SUCCESS) max(|| b_i - A x_i ||_1) 6.906050e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.678046e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.689246e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.689246e-16 max(|| b_i - A x_i ||_1) 6.906050e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.678046e-04 (SUCCESS) max(|| b_i - A x_i ||_1) 
6.906050e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.678046e-04 (SUCCESS) Start 2153: mpi_dst_example_simple_lap_d_facto1_sched0_not_pqrcpend 1932/3626 Test #2155: mpi_dst_example_simple_lap_d_facto1_sched0_kway_pqrcpend ................***Timeout 384.10 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.195695e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.024433e-02 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.923508e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.370274e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.605549e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.794683e-01 s Time to initialize coeftab 5.370729e-01 s Time to factorize 3.722011e+00 s ( 1.41 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 6.259970e-01 s - iteration 1 : total iteration time 0.844 s error 2.4333e-16 Time for refinement 1.463325e+00 s || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.689246e-16 max(|| b_i - A x_i ||_1) 6.906050e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.678046e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.689246e-16 max(|| b_i - A x_i ||_1) 6.906050e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.678046e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.689246e-16 
max(|| b_i - A x_i ||_1) 6.906050e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.678046e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.689246e-16 max(|| b_i - A x_i ||_1) 6.906050e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.678046e-04 (SUCCESS) Start 2155: mpi_dst_example_simple_lap_d_facto1_sched0_kway_pqrcpend 1932/3626 Test #2156: mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_pqrcpbegin ...***Timeout 384.08 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.105222e+01 s +-------------------------------------------------+ 
Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.961415e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.309723e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.570419e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.295307e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.027148e-01 s Time to initialize coeftab 4.370584e-01 s Time to factorize 5.289988e+00 s (1013.06 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 2.790774e-01 s - iteration 1 : total iteration time 0.298 s error 1.52e-14 Time for refinement 7.504192e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.519674e-14 max(|| b_i - A x_i ||_1) 2.802993e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.522201e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 
1.519674e-14 max(|| b_i - A x_i ||_1) 2.802993e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.522201e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.519674e-14 max(|| b_i - A x_i ||_1) 2.802993e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.522201e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.519674e-14 max(|| b_i - A x_i ||_1) 2.802993e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.522201e-02 (SUCCESS) Start 2156: mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_pqrcpbegin 1932/3626 Test #2157: mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_pqrcpend .....***Timeout 384.05 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 
660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.127506e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.257563e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.404794e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.494987e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.400968e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.024749e+01 s Time to initialize coeftab 9.049727e-02 s Time to factorize 2.090925e+00 s ( 2.50 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 3.269886e-01 s - iteration 1 : total iteration time 0.336 s error 2.4333e-16 Time for refinement 7.353310e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i 
||_2 / || b_i ||_2) 2.689246e-16 max(|| b_i - A x_i ||_1) 6.906050e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.678046e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.689246e-16 max(|| b_i - A x_i ||_1) 6.906050e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.678046e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.689246e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.689246e-16 max(|| b_i - A x_i ||_1) 6.906050e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.678046e-04 (SUCCESS) max(|| b_i - A x_i ||_1) 6.906050e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.678046e-04 (SUCCESS) Start 2157: mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_pqrcpend 1932/3626 Test #2158: mpi_dst_example_simple_lap_d_facto1_sched0_not_rqrcpbegin ...............***Timeout 384.01 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The 
thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.368655e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.198718e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.782579e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.078317e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.595695e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.214348e-01 s Time to initialize coeftab 4.289941e-01 s Time to factorize 1.010502e+01 s (530.34 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.4 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 2.550300e-01 s - iteration 1 : total iteration time 0.58 s error 3.6959e-14 Time for refinement 1.021412e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 
5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.695451e-14 max(|| b_i - A x_i ||_1) 6.878037e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.642845e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.695451e-14 max(|| b_i - A x_i ||_1) 6.878037e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.642845e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.695451e-14 max(|| b_i - A x_i ||_1) 6.878037e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.642845e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.695451e-14 max(|| b_i - A x_i ||_1) 6.878037e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.642845e-02 (SUCCESS) Start 2158: mpi_dst_example_simple_lap_d_facto1_sched0_not_rqrcpbegin 1932/3626 Test #2159: mpi_dst_example_simple_lap_d_facto1_sched0_not_rqrcpend .................***Timeout 383.98 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 
1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.248873e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.092800e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.015992e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.703335e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.614668e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 9.103957e-01 s Time to initialize coeftab 5.484291e-01 s Time to factorize 1.016944e+01 s (526.98 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 4.280004e-01 s - iteration 1 : total iteration time 0.734 s error 2.4483e-16 Time for refinement 1.481337e+00 s || A ||_1 
5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.679365e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.679365e-16 max(|| b_i - A x_i ||_1) 6.888061e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.655440e-04 (SUCCESS) max(|| b_i - A x_i ||_1) 6.888061e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.655440e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.679365e-16 max(|| b_i - A x_i ||_1) 6.888061e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.655440e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.679365e-16 max(|| b_i - A x_i ||_1) 6.888061e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.655440e-04 (SUCCESS) Start 2159: mpi_dst_example_simple_lap_d_facto1_sched0_not_rqrcpend 1932/3626 Test #2160: mpi_dst_example_simple_lap_d_facto1_sched0_kway_rqrcpbegin ..............***Timeout 383.97 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance 
criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.891020e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.983266e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.552069e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 9.735644e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.063228e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.941216e-01 s Time to initialize coeftab 4.781217e-01 s Time to factorize 3.709599e+00 s ( 1.41 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.4 Ko / 88.6 Ko 
------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.708384e-01 s - iteration 1 : total iteration time 0.344 s error 3.6959e-14 Time for refinement 5.943294e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.695451e-14 max(|| b_i - A x_i ||_1) 6.878037e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.642845e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.695451e-14 max(|| b_i - A x_i ||_1) 6.878037e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.642845e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.695451e-14 max(|| b_i - A x_i ||_1) 6.878037e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.642845e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.695451e-14 max(|| b_i - A x_i ||_1) 6.878037e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.642845e-02 (SUCCESS) Start 2160: mpi_dst_example_simple_lap_d_facto1_sched0_kway_rqrcpbegin 1932/3626 Test #2161: mpi_dst_example_simple_lap_d_facto1_sched0_kway_rqrcpend ................***Timeout 383.96 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 
320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.371623e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.122227e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.708043e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.899612e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.975522e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.467476e-01 s Time to initialize coeftab 6.123526e-01 s Time to factorize 1.109959e+01 s (482.82 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ 
Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 7.310007e-01 s - iteration 1 : total iteration time 1.18 s error 2.4483e-16 Time for refinement 2.339994e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.679925e-16 max(|| b_i - A x_i ||_1) 6.905248e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.677037e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.679925e-16 max(|| b_i - A x_i ||_1) 6.905248e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.677037e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.679925e-16 max(|| b_i - A x_i ||_1) 6.905248e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.677037e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.679925e-16 max(|| b_i - A x_i ||_1) 6.905248e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.677037e-04 (SUCCESS) Start 2161: mpi_dst_example_simple_lap_d_facto1_sched0_kway_rqrcpend 1932/3626 Test #2162: mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_rqrcpbegin ...***Timeout 383.94 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI 
communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.306139e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.084788e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.064991e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.391236e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.527296e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.853577e-01 s Time to initialize coeftab 
6.822558e-01 s [arch-nspawn-3655178:1056967] *** Process received signal *** [arch-nspawn-3655178:1056967] Signal: Segmentation fault (11) [arch-nspawn-3655178:1056967] Signal code: Address not mapped (1) [arch-nspawn-3655178:1056967] Failing at address: 0x7f62151a1860 [arch-nspawn-3655178:1056967] [ 0] linux-vdso.so.1(__vdso_rt_sigreturn+0x0) [0x7fca99cc46cc] [arch-nspawn-3655178:1056967] [ 1] /usr/lib/libopen-pal.so.80(mca_btl_sm_poll_handle_frag+0x18a) [0x7fca8baf3a02] [arch-nspawn-3655178:1056967] [ 2] /usr/lib/libopen-pal.so.80(+0x74504) [0x7fca8baf4504] [arch-nspawn-3655178:1056967] [ 3] /usr/lib/libopen-pal.so.80(opal_progress+0x30) [0x7fca8baa5a7a] [arch-nspawn-3655178:1056967] [ 4] /usr/lib/libmpi.so.40(ompi_request_default_test_some+0x1a4) [0x7fca98280166] [arch-nspawn-3655178:1056967] [ 5] /usr/lib/libmpi.so.40(MPI_Testsome+0xbe) [0x7fca982d07f0] [arch-nspawn-3655178:1056967] [ 6] /build/pastix/src/build/kernels/libpastix_kernels.so(cpucblk_dmpi_progress+0x1ae) [0x7fca98638776] [arch-nspawn-3655178:1056967] [ 7] /build/pastix/src/build/kernels/libpastix_kernels.so(cpucblk_dincoming_deps+0x3e) [0x7fca98638be2] [arch-nspawn-3655178:1056967] [ 8] /build/pastix/src/build/libpastix.so.6.4(sequential_dsytrf+0x98) [0x7fca98719b26] [arch-nspawn-3655178:1056967] [ 9] /build/pastix/src/build/libpastix.so.6.4(sopalin_dsytrf+0x82) [0x7fca98721682] [arch-nspawn-3655178:1056967] [10] /build/pastix/src/build/libpastix.so.6.4(pastix_subtask_sopalin+0x190) [0x7fca986fdd78] [arch-nspawn-3655178:1056967] [11] ./simple(+0xece) [0x555555556ece] [arch-nspawn-3655178:1056967] [12] /usr/lib/libc.so.6(+0x27fae) [0x7fca980a4fae] [arch-nspawn-3655178:1056967] [13] /usr/lib/libc.so.6(__libc_start_main+0x72) [0x7fca980a50b8] [arch-nspawn-3655178:1056967] [14] ./simple(+0x1174) [0x555555557174] [arch-nspawn-3655178:1056967] *** End of error message *** [arch-nspawn-3655178:1056964] *** Process received signal *** [arch-nspawn-3655178:1056960] *** Process received signal *** 
[arch-nspawn-3655178:1056960] Signal: Segmentation fault (11) [arch-nspawn-3655178:1056960] Signal code: Address not mapped (1) [arch-nspawn-3655178:1056960] Failing at address: 0x7f9dc8aa23e0 [arch-nspawn-3655178:1056960] [ 0] linux-vdso.so.1(__vdso_rt_sigreturn+0x0) [0x7f59f2afa6cc] [arch-nspawn-3655178:1056960] [ 1] /usr/lib/libopen-pal.so.80(mca_btl_sm_poll_handle_frag+0x18a) [0x7f59f09a3a02] [arch-nspawn-3655178:1056960] [ 2] /usr/lib/libopen-pal.so.80(+0x74504) [0x7f59f09a4504] [arch-nspawn-3655178:1056960] [ 3] /usr/lib/libopen-pal.so.80(opal_progress+0x30) [0x7f59f0955a7a] [arch-nspawn-3655178:1056960] [ 4] /usr/lib/libopen-pal.so.80(ompi_sync_wait_mt+0xda) [0x7f59f0982aa2] [arch-nspawn-3655178:1056960] [ 5] /usr/lib/libmpi.so.40(+0x7de1a) [0x7f59f107de1a] [arch-nspawn-3655178:1056960] [ 6] /usr/lib/libmpi.so.40(ompi_request_default_wait+0x1a) [0x7f59f108019c] [arch-nspawn-3655178:1056960] [ 7] /usr/lib/libmpi.so.40(PMPI_Wait+0x4c) [0x7f59f10db200] [arch-nspawn-3655178:1056960] [ 8] /build/pastix/src/build/kernels/libpastix_kernels.so(cpucblk_drequest_cleanup+0xae) [0x7f59f0fd818e] [arch-nspawn-3655178:1056960] [ 9] /build/pastix/src/build/libpastix.so.6.4(sopalin_dsytrf+0x94) [0x7f59f1549694] [arch-nspawn-3655178:1056960] [10] /build/pastix/src/build/libpastix.so.6.4(pastix_subtask_sopalin+0x190) [0x7f59f1525d78] [arch-nspawn-3655178:1056960] [11] ./simple(+0xece) [0x555555556ece] [arch-nspawn-3655178:1056960] [12] /usr/lib/libc.so.6(+0x27fae) [0x7f59f132cfae] [arch-nspawn-3655178:1056960] [13] /usr/lib/libc.so.6(__libc_start_main+0x72) [0x7f59f132d0b8] [arch-nspawn-3655178:1056960] [14] ./simple(+0x1174) [0x555555557174] [arch-nspawn-3655178:1056960] *** End of error message *** [arch-nspawn-3655178:1056964] Signal: Segmentation fault (11) [arch-nspawn-3655178:1056964] Signal code: Address not mapped (1) [arch-nspawn-3655178:1056964] Failing at address: 0x7fcfd56a1660 [arch-nspawn-3655178:1056964] [ 0] linux-vdso.so.1(__vdso_rt_sigreturn+0x0) 
[0x7f9923ebd6cc] [arch-nspawn-3655178:1056964] [ 1] /usr/lib/libopen-pal.so.80(mca_btl_sm_poll_handle_frag+0x18a) [0x7f9921ddaa02] [arch-nspawn-3655178:1056964] [ 2] /usr/lib/libopen-pal.so.80(+0x74504) [0x7f9921ddb504] [arch-nspawn-3655178:1056964] [ 3] /usr/lib/libopen-pal.so.80(opal_progress+0x30) [0x7f9921d8ca7a] [arch-nspawn-3655178:1056964] [ 4] /usr/lib/libmpi.so.40(ompi_request_default_test_some+0x1a4) [0x7f9922680166] Start 2162: mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_rqrcpbegin 1932/3626 Test #2163: mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_rqrcpend .....***Timeout 383.93 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 
+-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.791752e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.845664e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.003867e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.728620e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.234017e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.153337e-01 s Time to initialize coeftab 2.723782e+00 s Time to factorize 1.496305e+00 s ( 3.50 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.716659e-01 s - iteration 1 : total iteration time 0.106 s error 2.4483e-16 Time for refinement 2.788308e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 
/ || b_i ||_2) 2.679365e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.679365e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.679365e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.679365e-16 max(|| b_i - A x_i ||_1) 6.888061e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.655440e-04 (SUCCESS) max(|| b_i - A x_i ||_1) 6.888061e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.655440e-04 (SUCCESS) max(|| b_i - A x_i ||_1) 6.888061e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.655440e-04 (SUCCESS) max(|| b_i - A x_i ||_1) 6.888061e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.655440e-04 (SUCCESS) Start 2163: mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_rqrcpend 1932/3626 Test #2164: mpi_dst_example_simple_lap_d_facto1_sched0_not_tqrcpbegin ...............***Timeout 383.91 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread 
number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.229840e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.591066e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.080970e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.065873e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.442220e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.896691e-01 s Time to initialize coeftab 1.052284e+00 s Time to factorize 7.708456e+00 s (695.22 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.4 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 3.189859e-01 s - iteration 1 : total iteration time 0.48 s error 3.6959e-14 Time for refinement 9.171046e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 
5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.695448e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.695448e-14 max(|| b_i - A x_i ||_1) 6.877878e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.642645e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 6.877878e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.642645e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.695448e-14 max(|| b_i - A x_i ||_1) 6.877878e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.642645e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.695448e-14 max(|| b_i - A x_i ||_1) 6.877878e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.642645e-02 (SUCCESS) Start 2164: mpi_dst_example_simple_lap_d_facto1_sched0_not_tqrcpbegin 1932/3626 Test #2165: mpi_dst_example_simple_lap_d_facto1_sched0_not_tqrcpend .................***Timeout 383.89 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 
Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.051788e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.312678e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.623695e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.801668e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.006482e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.580900e+00 s Time to initialize coeftab 1.395893e+00 s Time to factorize 9.374053e+00 s (571.69 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.338000e+00 s - iteration 1 : total iteration time 1.07 s error 2.4483e-16 Time for refinement 2.223150e+00 s || A ||_1 
5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.680173e-16 max(|| b_i - A x_i ||_1) 6.903412e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.674731e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.680173e-16 max(|| b_i - A x_i ||_1) 6.903412e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.674731e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.680173e-16 max(|| b_i - A x_i ||_1) 6.903412e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.674731e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.680173e-16 max(|| b_i - A x_i ||_1) 6.903412e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.674731e-04 (SUCCESS) Start 2165: mpi_dst_example_simple_lap_d_facto1_sched0_not_tqrcpend 1932/3626 Test #2166: mpi_dst_example_simple_lap_d_facto1_sched0_kway_tqrcpbegin ..............***Timeout 383.88 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance 
criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.933577e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.648501e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.648052e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.759622e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.389116e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 6.249686e-01 s Time to initialize coeftab 5.619715e-01 s Time to factorize 9.492845e+00 s (564.54 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.4 Ko / 88.6 Ko 
------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 3.410944e-01 s - iteration 1 : total iteration time 0.24 s error 3.6959e-14 Time for refinement 7.286300e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.695448e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.695448e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.695448e-14 max(|| b_i - A x_i ||_1) 6.877887e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.642656e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 6.877887e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.642656e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 6.877887e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.642656e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.695448e-14 max(|| b_i - A x_i ||_1) 6.877887e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.642656e-02 (SUCCESS) Start 2166: mpi_dst_example_simple_lap_d_facto1_sched0_kway_tqrcpbegin 1932/3626 Test #2167: mpi_dst_example_simple_lap_d_facto1_sched0_kway_tqrcpend ................***Timeout 383.85 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 
Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.867612e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.119873e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.281574e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.976125e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.124830e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.823323e-01 s Time to initialize coeftab 5.288775e-01 s Time to factorize 2.495694e+00 s ( 2.10 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ 
Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 8.398019e-02 s - iteration 1 : total iteration time 0.0616 s error 2.4483e-16 Time for refinement 2.298418e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.680173e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.680173e-16 max(|| b_i - A x_i ||_1) 6.903412e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.674731e-04 (SUCCESS) max(|| b_i - A x_i ||_1) 6.903412e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.674731e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.680173e-16 max(|| b_i - A x_i ||_1) 6.903412e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.674731e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.680173e-16 max(|| b_i - A x_i ||_1) 6.903412e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.674731e-04 (SUCCESS) Start 2167: mpi_dst_example_simple_lap_d_facto1_sched0_kway_tqrcpend 1932/3626 Test #2168: mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_tqrcpbegin ...***Timeout 383.82 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: 
sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.877314e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.957954e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.782859e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.758061e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.405371e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.159588e+00 s Time to initialize coeftab 
6.044498e-01 s Time to factorize 3.140227e+01 s (170.66 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.4 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 7.080604e-01 s - iteration 1 : total iteration time 1.18 s error 3.6959e-14 Time for refinement 2.288590e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.695448e-14 max(|| b_i - A x_i ||_1) 6.877878e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.642645e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.695448e-14 max(|| b_i - A x_i ||_1) 6.877878e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.642645e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.695448e-14 max(|| b_i - A x_i ||_1) 6.877878e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.642645e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.695448e-14 max(|| b_i - A x_i ||_1) 6.877878e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.642645e-02 (SUCCESS) Start 2168: mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_tqrcpbegin 1932/3626 Test #2169: mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_tqrcpend .....***Timeout 383.80 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been 
automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.151398e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.337410e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.298571e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.529784e+00 s +-------------------------------------------------+ Analyze task: Total time for 
analyze 3.653714e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 7.512885e-01 s Time to initialize coeftab 5.947648e-01 s Time to factorize 9.407901e+00 s (569.64 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 4.559546e-01 s - iteration 1 : total iteration time 0.693 s error 2.4483e-16 Time for refinement 1.429997e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.680173e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.680173e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.680173e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.680173e-16 max(|| b_i - A x_i ||_1) 6.903412e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.674731e-04 (SUCCESS) max(|| b_i - A x_i ||_1) 6.903412e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.674731e-04 (SUCCESS) max(|| b_i - A x_i ||_1) 6.903412e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.674731e-04 (SUCCESS) max(|| b_i - A x_i ||_1) 6.903412e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.674731e-04 (SUCCESS) Start 2169: mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_tqrcpend 1932/3626 Test #2171: mpi_dst_example_simple_lap_d_facto1_sched0_not_rqrrtend .................***Timeout 383.71 sec ischedInit: The thread number has been 
automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.368868e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.459343e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.999381e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: 
Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.485876e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.684931e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 5.983576e-01 s Time to initialize coeftab 3.083013e-01 s Time to factorize 4.476673e+00 s ( 1.17 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 7.426352e-01 s - iteration 1 : total iteration time 1.03 s error 2.8928e-15 Time for refinement 2.268389e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.897582e-15 max(|| b_i - A x_i ||_1) 3.010213e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.782591e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.897582e-15 max(|| b_i - A x_i ||_1) 3.010213e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.782591e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.897582e-15 max(|| b_i - A x_i ||_1) 3.010213e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.782591e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.897582e-15 max(|| b_i - A x_i ||_1) 3.010213e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.782591e-03 (SUCCESS) Start 2171: 
mpi_dst_example_simple_lap_d_facto1_sched0_not_rqrrtend 1932/3626 Test #2172: mpi_dst_example_simple_lap_d_facto1_sched0_kway_rqrrtbegin ..............***Timeout 383.64 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.049824e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.082784e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 
6.740070e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.932966e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.945991e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.428643e+00 s Time to initialize coeftab 1.990577e+00 s Time to factorize 1.004548e+01 s (533.48 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.176295e-01 s - iteration 1 : total iteration time 0.115 s error 6.4174e-13 Time for refinement 2.704904e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.417417e-13 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.417417e-13 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.417417e-13 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.417417e-13 max(|| b_i - A x_i ||_1) 1.272697e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.599253e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.272697e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.599253e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.272697e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * 
||x_i||_oo * eps)) 1.599253e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.272697e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.599253e+00 (SUCCESS) Start 2172: mpi_dst_example_simple_lap_d_facto1_sched0_kway_rqrrtbegin 1932/3626 Test #2173: mpi_dst_example_simple_lap_d_facto1_sched0_kway_rqrrtend ................***Timeout 383.58 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.563330e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to 
compute symbol matrix 2.778941e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.477652e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.577085e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.892922e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 6.936514e-01 s Time to initialize coeftab 3.086581e-01 s Time to factorize 4.671570e+00 s ( 1.12 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 5.985182e-01 s - iteration 1 : total iteration time 1.25 s error 2.8928e-15 Time for refinement 2.672607e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.897596e-15 max(|| b_i - A x_i ||_1) 3.012481e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.785442e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.897596e-15 max(|| b_i - A x_i ||_1) 3.012481e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.785442e-03 (SUCCESS) max(|| b_i - A x_i 
||_2 / || b_i ||_2) 2.897596e-15 max(|| b_i - A x_i ||_1) 3.012481e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.785442e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.897596e-15 max(|| b_i - A x_i ||_1) 3.012481e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.785442e-03 (SUCCESS) Start 2173: mpi_dst_example_simple_lap_d_facto1_sched0_kway_rqrrtend 1932/3626 Test #2174: mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_rqrrtbegin ...***Timeout 383.57 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.227448e+01 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.937554e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.743856e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.907139e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.911802e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.781610e-01 s Time to initialize coeftab 7.718632e-01 s Time to factorize 1.005169e+01 s (533.15 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.286556e-01 s - iteration 1 : total iteration time 0.162 s error 6.4174e-13 Time for refinement 3.391526e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.417417e-13 max(|| b_i - A x_i ||_1) 1.272697e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.599253e+00 
(SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.417417e-13 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.417417e-13 max(|| b_i - A x_i ||_1) 1.272697e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.599253e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.272697e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.599253e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.417417e-13 max(|| b_i - A x_i ||_1) 1.272697e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.599253e+00 (SUCCESS) Start 2174: mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_rqrrtbegin 1932/3626 Test #2175: mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_rqrrtend .....***Timeout 383.54 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N 
nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.391885e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.555159e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.482374e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.495283e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.814793e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 6.035123e-01 s Time to initialize coeftab 3.611597e-01 s Time to factorize 2.422017e+00 s ( 2.16 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 2.880055e-01 s - iteration 1 : total iteration time 0.335 s error 2.8928e-15 Time for refinement 6.600852e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 
max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.897582e-15 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.897582e-15 max(|| b_i - A x_i ||_1) 3.010213e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.782591e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.897582e-15 max(|| b_i - A x_i ||_1) 3.010213e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.782591e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.897582e-15 max(|| b_i - A x_i ||_1) 3.010213e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.782591e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 3.010213e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.782591e-03 (SUCCESS) Start 2175: mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_rqrrtend Test #2006: mpi_dst_example_simple_lap_d_facto1_sched4_1d ...........................***Timeout 383.26 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to 
compute ordering 3.178459e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.218795e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.197762e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.060948e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.660509e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.127062e-01 s Time to initialize coeftab 4.839620e-01 s Time to factorize 3.547788e+00 s ( 1.48 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Memory usage of coeftab 137 Ko Time to solve 6.967900e+00 s Time for refinement 2.389698e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.903953e-16 max(|| b_i - A x_i ||_1) 1.677454e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.107865e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.903953e-16 max(|| b_i - A x_i ||_1) 1.677454e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.107865e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.903953e-16 max(|| b_i - A x_i ||_1) 1.677454e-16 max(|| b_i - A x_i ||_1 / 
(||A||_1 * ||x_i||_oo * eps)) 2.107865e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.903953e-16 max(|| b_i - A x_i ||_1) 1.677454e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.107865e-03 (SUCCESS) Start 2006: mpi_dst_example_simple_lap_d_facto1_sched4_1d Test #1649: shm_example_simple_lap_z_facto0_sched4_kwayprojections_rqrrtbegin .......***Timeout 383.24 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.454736e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.999374e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.437095e+00 s 
+-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 6.102954e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.268478e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.641873e-02 s Time to initialize coeftab 7.145124e-01 s Time to factorize 5.271239e+00 s ( 3.85 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 497 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1.25 Mo / 1.25 Mo Time to solve 1.562722e+00 s - iteration 1 : total iteration time 2.14 s error 1.0842e-13 Time for refinement 4.118095e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.084281e-13 max(|| b_i - A x_i ||_1) 2.430242e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.132326e-01 (SUCCESS) Test #2037: mpi_dst_example_simple_lap_s_facto0_sched0_not_tqrcpend .................***Timeout 383.21 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: 
Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.834798e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.304293e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.497054e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 7.916278e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.654240e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.517438e-01 s Time to initialize coeftab 6.768385e-01 s Time to factorize 8.417939e+00 s (615.80 KFlop/s) Number of operations 25.89 MFlops 
Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 7.755357e-01 s Time for refinement 8.216844e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.111303e-07 max(|| b_i - A x_i ||_1) 9.597788e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.206049e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.111303e-07 max(|| b_i - A x_i ||_1) 9.597788e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.206049e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.111303e-07 max(|| b_i - A x_i ||_1) 9.597788e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.206049e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.111303e-07 max(|| b_i - A x_i ||_1) 9.597788e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.206049e+00 (SUCCESS) Start 2037: mpi_dst_example_simple_lap_s_facto0_sched0_not_tqrcpend Test #2042: mpi_dst_example_simple_lap_s_facto0_sched0_not_rqrrtbegin ...............***Timeout 383.19 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 
6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.140820e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.783141e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.560884e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.762092e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.619068e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.000874e-01 s Time to initialize coeftab 
4.789061e-01 s Time to factorize 3.438849e+00 s ( 1.47 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.435373e-01 s - iteration 1 : total iteration time 0.503 s error 5.5928e-11 Time for refinement 8.817836e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.770989e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.770989e-08 max(|| b_i - A x_i ||_1) 2.855581e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.588296e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.855581e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.588296e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.770989e-08 max(|| b_i - A x_i ||_1) 2.855581e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.588296e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.770989e-08 max(|| b_i - A x_i ||_1) 2.855581e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.588296e-01 (SUCCESS) Start 2042: mpi_dst_example_simple_lap_s_facto0_sched0_not_rqrrtbegin Test #2044: mpi_dst_example_simple_lap_s_facto0_sched0_kway_rqrrtbegin ..............***Timeout 383.18 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: 
Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.177918e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.528423e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.341497e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.217532e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.905651e+01 s 
+-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.824779e+00 s Time to initialize coeftab 4.047219e-01 s Time to factorize 1.210401e+01 s (428.27 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.829473e-01 s - iteration 1 : total iteration time 0.176 s error 5.5928e-11 Time for refinement 4.326870e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.770989e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.770989e-08 max(|| b_i - A x_i ||_1) 2.855581e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.588296e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.855581e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.588296e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.770989e-08 max(|| b_i - A x_i ||_1) 2.855581e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.588296e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.770989e-08 max(|| b_i - A x_i ||_1) 2.855581e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.588296e-01 (SUCCESS) Start 2044: mpi_dst_example_simple_lap_s_facto0_sched0_kway_rqrrtbegin Test #2045: mpi_dst_example_simple_lap_s_facto0_sched0_kway_rqrrtend ................***Timeout 383.16 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The 
thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.986857e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.006095e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.704169e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 
5.477282e-04 s Time for mapping/scheduling 1.948544e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.222405e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.773690e-01 s Time to initialize coeftab 6.859808e-01 s Time to factorize 1.010000e+00 s ( 5.01 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 2.069925e-01 s Time for refinement 8.997107e-02 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.041276e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.041276e-07 max(|| b_i - A x_i ||_1) 9.559704e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.201263e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.041276e-07 max(|| b_i - A x_i ||_1) 9.559704e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.201263e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 9.559704e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.201263e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.041276e-07 max(|| b_i - A x_i ||_1) 9.559704e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.201263e+00 (SUCCESS) Start 2045: mpi_dst_example_simple_lap_s_facto0_sched0_kway_rqrrtend Test #2055: mpi_dst_example_simple_lap_s_facto1_sched0_kwayprojections_svdend 
.......***Timeout 383.14 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.722086e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.670743e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.499907e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 
17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.282250e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.857960e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.113135e-01 s Time to initialize coeftab 6.368954e-01 s Time to factorize 3.179002e+00 s ( 1.65 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.075814e-01 s Time for refinement 9.321987e-02 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.685047e-07 max(|| b_i - A x_i ||_1) 7.416855e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.319949e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.685047e-07 max(|| b_i - A x_i ||_1) 7.416855e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.319949e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.685047e-07 max(|| b_i - A x_i ||_1) 7.416855e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.319949e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.685047e-07 max(|| b_i - A x_i ||_1) 7.416855e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.319949e-01 (SUCCESS) Start 2055: 
mpi_dst_example_simple_lap_s_facto1_sched0_kwayprojections_svdend Test #2059: mpi_dst_example_simple_lap_s_facto1_sched0_kway_pqrcpend ................***Timeout 383.12 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.373838e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.537898e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.444875e-03 
s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.142336e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.497877e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.732612e-01 s Time to initialize coeftab 6.321530e-01 s Time to factorize 8.335060e-01 s ( 6.28 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 9.861757e-02 s Time for refinement 1.581890e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.983820e-07 max(|| b_i - A x_i ||_1) 8.588977e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.079283e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.983820e-07 max(|| b_i - A x_i ||_1) 8.588977e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.079283e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.983820e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.983820e-07 max(|| b_i - A x_i ||_1) 8.588977e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.079283e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 
8.588977e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.079283e+00 (SUCCESS) Start 2059: mpi_dst_example_simple_lap_s_facto1_sched0_kway_pqrcpend Test #1668: shm_example_simple_lap_z_facto1_sched4_kway_rqrcpend ....................***Timeout 383.11 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.734454e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.852855e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.923423e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of 
operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 8.687048e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.858345e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 5.123825e-01 s Time to initialize coeftab 8.951993e-02 s Time to factorize 2.212334e+00 s ( 9.63 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 497 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1.25 Mo / 1.25 Mo Time to solve 3.390584e+00 s Time for refinement 1.675121e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.755697e-16 max(|| b_i - A x_i ||_1) 1.835514e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.631625e-03 (SUCCESS) Test #1671: shm_example_simple_lap_z_facto1_sched4_not_tqrcpbegin ...................***Timeout 383.08 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress 
minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.218388e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.728779e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.600428e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.635999e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.340348e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.614698e-01 s Time to initialize coeftab 2.320610e+00 s Time to factorize 7.848937e+00 s ( 2.71 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 497 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1.25 Mo / 1.25 Mo Time to solve 3.810895e+00 s - iteration 1 : total iteration time 6.27 s error 1.0663e-14 Time for refinement 1.609433e+01 s 
|| A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.065763e-14 max(|| b_i - A x_i ||_1) 1.477996e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.729485e-02 (SUCCESS) Test #2070: mpi_dst_example_simple_lap_s_facto1_sched0_kway_tqrcpbegin ..............***Timeout 383.07 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.398552e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 
17.592432 Time to compute symbol matrix 6.316850e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.250027e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.658225e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.595134e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 7.456441e-02 s Time to initialize coeftab 7.596618e-01 s Time to factorize 2.652645e+00 s ( 1.97 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.2 Ko / 44.3 Ko ------------------------------------------------ Total 68.4 Ko / 68.5 Ko Time to solve 6.104380e-02 s - iteration 1 : total iteration time 0.0228 s error 5.5143e-11 Time for refinement 9.515918e-02 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.860206e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.860206e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.860206e-08 max(|| b_i - A x_i ||_1) 2.884300e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.624383e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.860206e-08 max(|| b_i 
- A x_i ||_1) 2.884300e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.624383e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.884300e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.624383e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.884300e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.624383e-01 (SUCCESS) Start 2070: mpi_dst_example_simple_lap_s_facto1_sched0_kway_tqrcpbegin Test #2090: mpi_dst_example_simple_lap_s_facto2_sched0_kway_pqrcpbegin ..............***Timeout 383.06 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.897024e+01 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.231644e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.766417e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.219703e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.037977e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.882697e-01 s Time to initialize coeftab 4.155270e-01 s Time to factorize 3.072834e+00 s ( 3.25 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 2.187960e-01 s - iteration 1 : total iteration time 0.239 s error 9.1047e-11 Time for refinement 5.025814e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.209517e-08 max(|| b_i - A x_i ||_1) 2.990139e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.757380e-01 
(SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.209517e-08 max(|| b_i - A x_i ||_1) 2.990139e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.757380e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.209517e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.209517e-08 max(|| b_i - A x_i ||_1) 2.990139e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.757380e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.990139e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.757380e-01 (SUCCESS) Start 2090: mpi_dst_example_simple_lap_s_facto2_sched0_kway_pqrcpbegin Test #2094: mpi_dst_example_simple_lap_s_facto2_sched0_not_rqrcpbegin ...............***Timeout 383.05 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 
200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.306964e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.520519e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.377817e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.648868e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.755498e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.794822e-01 s Time to initialize coeftab 5.261527e-01 s Time to factorize 4.862383e+00 s ( 2.05 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.4 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 2.749962e-01 s - iteration 1 : total iteration time 0.351 s error 9.1294e-11 Time for refinement 8.019839e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i 
- A x_i ||_2 / || b_i ||_2) 6.171960e-08 max(|| b_i - A x_i ||_1) 2.948495e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.705050e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.171960e-08 max(|| b_i - A x_i ||_1) 2.948495e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.705050e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.171960e-08 max(|| b_i - A x_i ||_1) 2.948495e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.705050e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.171960e-08 max(|| b_i - A x_i ||_1) 2.948495e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.705050e-01 (SUCCESS) Start 2094: mpi_dst_example_simple_lap_s_facto2_sched0_not_rqrcpbegin 1935/3626 Test #2176: mpi_dst_example_simple_lap_d_facto1_sched0_kway_pqrcpilu0 ...............***Timeout 383.02 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread 
number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.604747e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.414528e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.223382e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.059316e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.786725e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.983527e+00 s Time to initialize coeftab 1.308889e+00 s Time to factorize 8.398870e+00 s (638.07 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 4.378802e-01 s - iteration 1 : total iteration time 1.03 s error 7.8828e-15 Time for refinement 2.533782e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 
5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.886039e-15 max(|| b_i - A x_i ||_1) 1.267413e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.592614e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.886039e-15 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.886039e-15 max(|| b_i - A x_i ||_1) 1.267413e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.592614e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.886039e-15 max(|| b_i - A x_i ||_1) 1.267413e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.592614e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 1.267413e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.592614e-02 (SUCCESS) Start 2176: mpi_dst_example_simple_lap_d_facto1_sched0_kway_pqrcpilu0 1935/3626 Test #2177: mpi_dst_example_simple_lap_d_facto1_sched0_kway_pqrcpilu1 ...............***Timeout 383.02 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels 
of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.538104e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.288062e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.723375e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.843977e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.379585e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.237729e+00 s Time to initialize coeftab 6.785937e-01 s Time to factorize 1.514146e+01 s (353.93 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 3.215133e-01 s - iteration 1 : total iteration time 0.839 s error 7.8828e-15 Time for refinement 1.479286e+00 s || A ||_1 
5.112481e-02 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.886039e-15 max(|| b_i - A x_i ||_1) 1.267390e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.592584e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.886039e-15 max(|| b_i - A x_i ||_1) 1.267390e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.592584e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.886039e-15 max(|| b_i - A x_i ||_1) 1.267390e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.592584e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.886039e-15 max(|| b_i - A x_i ||_1) 1.267390e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.592584e-02 (SUCCESS) Start 2177: mpi_dst_example_simple_lap_d_facto1_sched0_kway_pqrcpilu1 1935/3626 Test #2178: mpi_dst_example_simple_lap_d_facto2_sched0_not_svdbegin .................***Timeout 383.01 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance 
criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.119282e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.615340e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.278178e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.438253e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.476163e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.535286e-01 s Time to initialize coeftab 6.996625e-01 s Time to factorize 4.021288e+00 s ( 2.48 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko 
------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 5.567150e-02 s - iteration 1 : total iteration time 0.0458 s error 1.9555e-14 Time for refinement 1.063817e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.955658e-14 max(|| b_i - A x_i ||_1) 3.849368e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.837062e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.955658e-14 max(|| b_i - A x_i ||_1) 3.849368e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.837062e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.955658e-14 max(|| b_i - A x_i ||_1) 3.849368e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.837062e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.955658e-14 max(|| b_i - A x_i ||_1) 3.849368e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.837062e-02 (SUCCESS) Start 2178: mpi_dst_example_simple_lap_d_facto2_sched0_not_svdbegin 1935/3626 Test #2179: mpi_dst_example_simple_lap_d_facto2_sched0_not_svdend ...................***Timeout 382.99 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple 
Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.134369e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.317890e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.224856e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.936542e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.336563e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 5.500245e-02 s Time to initialize coeftab 6.090226e-01 s Time to factorize 2.713602e+00 s ( 3.68 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank 
supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 2.204440e-01 s - iteration 1 : total iteration time 0.306 s error 2.7001e-16 Time for refinement 5.654911e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.931919e-16 max(|| b_i - A x_i ||_1) 7.837858e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.848942e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.931919e-16 max(|| b_i - A x_i ||_1) 7.837858e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.848942e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.931919e-16 max(|| b_i - A x_i ||_1) 7.837858e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.848942e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.931919e-16 max(|| b_i - A x_i ||_1) 7.837858e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.848942e-04 (SUCCESS) Start 2179: mpi_dst_example_simple_lap_d_facto2_sched0_not_svdend 1935/3626 Test #2180: mpi_dst_example_simple_lap_d_facto2_sched0_kway_svdbegin ................***Timeout 382.98 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: 
Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.668829e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.033098e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.549628e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 4.002258e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.407866e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.451410e+00 s Time to initialize coeftab 4.873464e-01 s Time to factorize 
1.329222e+01 s (769.19 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 3.699143e-01 s - iteration 1 : total iteration time 0.315 s error 1.9555e-14 Time for refinement 1.088892e+00 s || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.955754e-14 max(|| b_i - A x_i ||_1) 3.849616e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.837374e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.955754e-14 max(|| b_i - A x_i ||_1) 3.849616e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.837374e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.955754e-14 max(|| b_i - A x_i ||_1) 3.849616e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.837374e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.955754e-14 max(|| b_i - A x_i ||_1) 3.849616e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.837374e-02 (SUCCESS) Start 2180: mpi_dst_example_simple_lap_d_facto2_sched0_kway_svdbegin 1935/3626 Test #2182: mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_svdbegin .....***Timeout 382.95 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX 
package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.871730e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.184188e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.802415e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.459799e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.298649e+01 s 
+-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.250105e-01 s Time to initialize coeftab 7.388238e-01 s Time to factorize 2.528718e+01 s (404.32 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 7.158955e-01 s - iteration 1 : total iteration time 0.961 s error 1.9555e-14 Time for refinement 2.479550e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.955754e-14 max(|| b_i - A x_i ||_1) 3.849616e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.837374e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.955754e-14 max(|| b_i - A x_i ||_1) 3.849616e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.837374e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.955754e-14 max(|| b_i - A x_i ||_1) 3.849616e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.837374e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.955754e-14 max(|| b_i - A x_i ||_1) 3.849616e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.837374e-02 (SUCCESS) Start 2182: mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_svdbegin 1935/3626 Test #2183: mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_svdend .......***Timeout 382.93 sec ischedInit: The thread number has been automatically set to 256 
ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.832854e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.495118e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.071960e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL 
Time to factorize 1.548061e-03 s Time for mapping/scheduling 9.101776e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.017700e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.789545e-01 s Time to initialize coeftab 9.935284e-01 s Time to factorize 4.261388e+00 s ( 2.34 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 1.916505e-01 s - iteration 1 : total iteration time 0.374 s error 2.7001e-16 Time for refinement 8.927904e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.931919e-16 max(|| b_i - A x_i ||_1) 7.837858e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.848942e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.931919e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.931919e-16 max(|| b_i - A x_i ||_1) 7.837858e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.848942e-04 (SUCCESS) max(|| b_i - A x_i ||_1) 7.837858e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.848942e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.931919e-16 max(|| b_i - A x_i ||_1) 7.837858e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.848942e-04 (SUCCESS) Start 2183: mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_svdend 1935/3626 
Test #2184: mpi_dst_example_simple_lap_d_facto2_sched0_not_pqrcpbegin ...............***Timeout 382.91 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.141944e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.873921e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.169842e-01 s +-------------------------------------------------+ 
Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 4.494970e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.815784e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.880249e+00 s Time to initialize coeftab 8.852597e-01 s Time to factorize 2.514975e+01 s (406.53 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 7.527728e-01 s - iteration 1 : total iteration time 0.526 s error 1.5094e-14 Time for refinement 1.106870e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.509428e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.509428e-14 max(|| b_i - A x_i ||_1) 2.753758e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.460334e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 2.753758e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.460334e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.509428e-14 max(|| b_i - A x_i ||_1) 2.753758e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.460334e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.509428e-14 max(|| b_i - A x_i ||_1) 
2.753758e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.460334e-02 (SUCCESS) Start 2184: mpi_dst_example_simple_lap_d_facto2_sched0_not_pqrcpbegin 1935/3626 Test #2185: mpi_dst_example_simple_lap_d_facto2_sched0_not_pqrcpend .................***Timeout 382.88 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.149455e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.862629e-02 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.337776e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.411993e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.432526e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 4.542196e-01 s Time to initialize coeftab 3.389514e+00 s Time to factorize 1.246826e+00 s ( 8.01 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 2.204984e-01 s - iteration 1 : total iteration time 0.352 s error 2.7955e-16 Time for refinement 9.081351e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.020491e-16 max(|| b_i - A x_i ||_1) 7.810458e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.814511e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.020491e-16 max(|| b_i - A x_i ||_1) 7.810458e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.814511e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.020491e-16 max(|| b_i 
- A x_i ||_1) 7.810458e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.814511e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.020491e-16 max(|| b_i - A x_i ||_1) 7.810458e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.814511e-04 (SUCCESS) Start 2185: mpi_dst_example_simple_lap_d_facto2_sched0_not_pqrcpend 1935/3626 Test #2186: mpi_dst_example_simple_lap_d_facto2_sched0_kway_pqrcpbegin ..............***Timeout 382.86 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.415170e+01 s +-------------------------------------------------+ Symbolic factorization 
subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.530650e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.006169e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 7.644833e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.192040e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 5.387863e-01 s Time to initialize coeftab 7.876389e-01 s Time to factorize 1.419551e+01 s (720.24 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 7.118885e-01 s - iteration 1 : total iteration time 0.603 s error 1.5094e-14 Time for refinement 1.586048e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.509428e-14 max(|| b_i - A x_i ||_1) 2.753758e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.460334e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.509428e-14 max(|| b_i - A x_i 
||_1) 2.753758e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.460334e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.509428e-14 max(|| b_i - A x_i ||_1) 2.753758e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.460334e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.509428e-14 max(|| b_i - A x_i ||_1) 2.753758e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.460334e-02 (SUCCESS) Start 2186: mpi_dst_example_simple_lap_d_facto2_sched0_kway_pqrcpbegin 1935/3626 Test #2187: mpi_dst_example_simple_lap_d_facto2_sched0_kway_pqrcpend ................***Timeout 382.85 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ 
Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.748858e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.898362e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.874162e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.783834e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.998891e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 5.115085e-01 s Time to initialize coeftab 4.198821e-01 s Time to factorize 1.735991e+01 s (588.96 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 2.040605e-01 s - iteration 1 : total iteration time 0.381 s error 2.7955e-16 Time for refinement 7.505841e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.020393e-16 max(|| b_i - A x_i ||_1) 
7.812611e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.817217e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.020393e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.020393e-16 max(|| b_i - A x_i ||_1) 7.812611e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.817217e-04 (SUCCESS) max(|| b_i - A x_i ||_1) 7.812611e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.817217e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.020393e-16 max(|| b_i - A x_i ||_1) 7.812611e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.817217e-04 (SUCCESS) Start 2187: mpi_dst_example_simple_lap_d_facto2_sched0_kway_pqrcpend 1935/3626 Test #2188: mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_pqrcpbegin ...***Timeout 382.84 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix 
type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.530757e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.268157e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.901398e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.772855e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.013092e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 5.793499e-01 s Time to initialize coeftab 4.691212e-01 s Time to factorize 5.504429e+00 s ( 1.81 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 7.341274e-02 s - iteration 1 : total iteration time 0.0563 s error 1.5094e-14 Time for refinement 2.523754e-01 s || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 
4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.509423e-14 max(|| b_i - A x_i ||_1) 2.753615e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.460154e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.509423e-14 max(|| b_i - A x_i ||_1) 2.753615e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.460154e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.509423e-14 max(|| b_i - A x_i ||_1) 2.753615e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.460154e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.509423e-14 max(|| b_i - A x_i ||_1) 2.753615e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.460154e-02 (SUCCESS) Start 2188: mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_pqrcpbegin 1935/3626 Test #2189: mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_pqrcpend .....***Timeout 382.83 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections 
Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.706565e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.341818e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.449557e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.316159e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.457784e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.358660e-01 s Time to initialize coeftab 4.364931e-01 s Time to factorize 2.170204e+00 s ( 4.60 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 4.672211e-01 s - iteration 1 : total iteration time 0.612 s error 2.7955e-16 Time for refinement 1.199722e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| 
x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.020393e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.020393e-16 max(|| b_i - A x_i ||_1) 7.812611e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.817217e-04 (SUCCESS) max(|| b_i - A x_i ||_1) 7.812611e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.817217e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.020393e-16 max(|| b_i - A x_i ||_1) 7.812611e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.817217e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.020393e-16 max(|| b_i - A x_i ||_1) 7.812611e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.817217e-04 (SUCCESS) Start 2189: mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_pqrcpend 1935/3626 Test #2190: mpi_dst_example_simple_lap_d_facto2_sched0_not_rqrcpbegin ...............***Timeout 382.82 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization 
method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.095716e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.005440e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.685573e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.889843e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.365934e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 5.764329e-01 s Time to initialize coeftab 1.586544e+00 s Time to factorize 1.306298e+01 s (782.69 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 225 Ko / 226 Ko Time to solve 
1.283925e-01 s - iteration 1 : total iteration time 0.108 s error 3.6705e-14 Time for refinement 3.143497e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.670340e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.670340e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.670340e-14 max(|| b_i - A x_i ||_1) 6.796084e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.539864e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 6.796084e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.539864e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 6.796084e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.539864e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.670340e-14 max(|| b_i - A x_i ||_1) 6.796084e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.539864e-02 (SUCCESS) Start 2190: mpi_dst_example_simple_lap_d_facto2_sched0_not_rqrcpbegin 1935/3626 Test #2191: mpi_dst_example_simple_lap_d_facto2_sched0_not_rqrcpend .................***Timeout 382.80 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 
Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.785362e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.189198e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.178934e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.722662e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.750725e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.356738e+00 s Time to initialize coeftab 1.946967e-01 s Time to factorize 6.838702e+00 s ( 1.46 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not 
selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 4.184537e-01 s - iteration 1 : total iteration time 0.259 s error 2.8321e-16 Time for refinement 7.970731e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.028938e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.028938e-16 max(|| b_i - A x_i ||_1) 7.593883e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.542367e-04 (SUCCESS) max(|| b_i - A x_i ||_1) 7.593883e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.542367e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.028938e-16 max(|| b_i - A x_i ||_1) 7.593883e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.542367e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.028938e-16 max(|| b_i - A x_i ||_1) 7.593883e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.542367e-04 (SUCCESS) Start 2191: mpi_dst_example_simple_lap_d_facto2_sched0_not_rqrcpend 1935/3626 Test #2192: mpi_dst_example_simple_lap_d_facto2_sched0_kway_rqrcpbegin ..............***Timeout 382.79 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: 
PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.935646e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.196017e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.736768e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.784155e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.793729e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 9.246489e-01 s Time to initialize coeftab 1.309184e+00 s Time to factorize 3.832395e+01 s (266.78 KFlop/s) Number of operations 8.17 MFlops Number of 
static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 225 Ko / 226 Ko Time to solve 2.946723e-01 s - iteration 1 : total iteration time 0.813 s error 3.6705e-14 Time for refinement 1.728144e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.670339e-14 max(|| b_i - A x_i ||_1) 6.796129e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.539920e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.670339e-14 max(|| b_i - A x_i ||_1) 6.796129e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.539920e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.670339e-14 max(|| b_i - A x_i ||_1) 6.796129e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.539920e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.670339e-14 max(|| b_i - A x_i ||_1) 6.796129e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.539920e-02 (SUCCESS) Start 2192: mpi_dst_example_simple_lap_d_facto2_sched0_kway_rqrcpbegin 1935/3626 Test #2193: mpi_dst_example_simple_lap_d_facto2_sched0_kway_rqrcpend ................***Timeout 382.77 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread 
dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.280164e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.449977e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.451830e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.523845e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.485109e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal 
csc 5.352348e-01 s Time to initialize coeftab 2.036871e-01 s Time to factorize 4.943074e+00 s ( 2.02 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 3.726125e-01 s - iteration 1 : total iteration time 0.702 s error 2.8321e-16 Time for refinement 1.443137e+00 s || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.029029e-16 max(|| b_i - A x_i ||_1) 7.597759e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.547237e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.029029e-16 max(|| b_i - A x_i ||_1) 7.597759e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.547237e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.029029e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.029029e-16 max(|| b_i - A x_i ||_1) 7.597759e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.547237e-04 (SUCCESS) max(|| b_i - A x_i ||_1) 7.597759e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.547237e-04 (SUCCESS) Start 2193: mpi_dst_example_simple_lap_d_facto2_sched0_kway_rqrcpend 1935/3626 Test #2194: mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_rqrcpbegin ...***Timeout 382.75 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: 
The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.818049e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.254904e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.146397e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.817027e+00 s +-------------------------------------------------+ Analyze 
task: Total time for analyze 4.569052e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 4.881923e-01 s Time to initialize coeftab 5.925074e-01 s Time to factorize 8.457903e+00 s ( 1.18 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 225 Ko / 226 Ko Time to solve 4.482492e-01 s - iteration 1 : total iteration time 0.66 s error 3.6705e-14 Time for refinement 1.476130e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.670340e-14 max(|| b_i - A x_i ||_1) 6.796084e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.539864e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.670340e-14 max(|| b_i - A x_i ||_1) 6.796084e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.539864e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.670340e-14 max(|| b_i - A x_i ||_1) 6.796084e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.539864e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.670340e-14 max(|| b_i - A x_i ||_1) 6.796084e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.539864e-02 (SUCCESS) Start 2194: mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_rqrcpbegin 1935/3626 Test #2195: mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_rqrcpend .....***Timeout 382.73 sec ischedInit: The thread 
number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.549106e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.540678e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.529654e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 
9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.538618e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.904446e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.715875e-01 s Time to initialize coeftab 3.445158e-01 s Time to factorize 5.743009e+00 s ( 1.74 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 2.520909e-01 s - iteration 1 : total iteration time 0.421 s error 2.8321e-16 Time for refinement 9.583683e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.028938e-16 max(|| b_i - A x_i ||_1) 7.593883e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.542367e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.028938e-16 max(|| b_i - A x_i ||_1) 7.593883e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.542367e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.028938e-16 max(|| b_i - A x_i ||_1) 7.593883e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.542367e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.028938e-16 max(|| b_i - A x_i ||_1) 7.593883e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.542367e-04 (SUCCESS) Start 2195: 
mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_rqrcpend 1935/3626 Test #2196: mpi_dst_example_simple_lap_d_facto2_sched0_not_tqrcpbegin ...............***Timeout 382.71 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.700788e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.014907e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for 
reordering 1.974212e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 5.119548e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.698243e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.464293e-01 s Time to initialize coeftab 5.800295e-01 s Time to factorize 1.364947e+01 s (749.06 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 225 Ko / 226 Ko Time to solve 5.340214e-01 s - iteration 1 : total iteration time 0.536 s error 3.6705e-14 Time for refinement 1.315185e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.670177e-14 max(|| b_i - A x_i ||_1) 6.795185e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.538734e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.670177e-14 max(|| b_i - A x_i ||_1) 6.795185e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.538734e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.670177e-14 max(|| b_i - A x_i ||_1) 6.795185e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.538734e-02 (SUCCESS) max(|| 
b_i - A x_i ||_2 / || b_i ||_2) 3.670177e-14 max(|| b_i - A x_i ||_1) 6.795185e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.538734e-02 (SUCCESS) Start 2196: mpi_dst_example_simple_lap_d_facto2_sched0_not_tqrcpbegin 1935/3626 Test #2197: mpi_dst_example_simple_lap_d_facto2_sched0_not_tqrcpend .................***Timeout 382.68 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.499216e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 
Time to compute symbol matrix 7.503346e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.647523e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 6.490179e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.210277e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 4.686517e-01 s Time to initialize coeftab 2.231122e-01 s Time to factorize 3.706938e+00 s ( 2.69 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 5.401839e-02 s - iteration 1 : total iteration time 0.0477 s error 2.8321e-16 Time for refinement 1.655247e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.028813e-16 max(|| b_i - A x_i ||_1) 7.588626e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.535761e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.028813e-16 max(|| b_i - A x_i ||_1) 7.588626e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.535761e-04 (SUCCESS) max(|| b_i - A 
x_i ||_2 / || b_i ||_2) 3.028813e-16 max(|| b_i - A x_i ||_1) 7.588626e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.535761e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.028813e-16 max(|| b_i - A x_i ||_1) 7.588626e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.535761e-04 (SUCCESS) Start 2197: mpi_dst_example_simple_lap_d_facto2_sched0_not_tqrcpend 1935/3626 Test #2198: mpi_dst_example_simple_lap_d_facto2_sched0_kway_tqrcpbegin ..............***Timeout 382.67 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.219503e+01 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.720295e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.769367e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.655254e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.575473e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 9.128358e-01 s Time to initialize coeftab 1.857389e+00 s Time to factorize 2.080532e+01 s (491.42 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 225 Ko / 226 Ko Time to solve 3.587409e-01 s - iteration 1 : total iteration time 0.617 s error 3.6705e-14 Time for refinement 1.164110e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.670177e-14 max(|| b_i - A x_i ||_1) 6.795185e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.538734e-02 (SUCCESS) 
max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.670177e-14 max(|| b_i - A x_i ||_1) 6.795185e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.538734e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.670177e-14 max(|| b_i - A x_i ||_1) 6.795185e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.538734e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.670177e-14 max(|| b_i - A x_i ||_1) 6.795185e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.538734e-02 (SUCCESS) Start 2198: mpi_dst_example_simple_lap_d_facto2_sched0_kway_tqrcpbegin 1935/3626 Test #2199: mpi_dst_example_simple_lap_d_facto2_sched0_kway_tqrcpend ................***Timeout 382.66 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 
760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.804742e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.065211e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.371841e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.691671e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.201792e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.282833e-01 s Time to initialize coeftab 2.871816e-01 s Time to factorize 4.602121e+00 s ( 2.17 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 3.612364e-01 s - iteration 1 : total iteration time 0.469 s error 2.8321e-16 Time for refinement 9.097288e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A 
x_i ||_2 / || b_i ||_2) 3.026881e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.026881e-16 max(|| b_i - A x_i ||_1) 7.574635e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.518180e-04 (SUCCESS) max(|| b_i - A x_i ||_1) 7.574635e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.518180e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.026881e-16 max(|| b_i - A x_i ||_1) 7.574635e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.518180e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.026881e-16 max(|| b_i - A x_i ||_1) 7.574635e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.518180e-04 (SUCCESS) Start 2199: mpi_dst_example_simple_lap_d_facto2_sched0_kway_tqrcpend 1935/3626 Test #2200: mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_tqrcpbegin ...***Timeout 382.64 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: 
The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.542648e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.225409e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.649292e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 3.903140e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.262692e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.342898e+00 s Time to initialize coeftab 1.314340e+00 s Time to factorize 2.508538e+01 s (407.58 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 225 Ko / 226 Ko Time to solve 4.342996e-01 s - iteration 1 : total iteration time 0.631 s error 3.6705e-14 Time for refinement 1.272852e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 
5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.670177e-14 max(|| b_i - A x_i ||_1) 6.795185e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.538734e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.670177e-14 max(|| b_i - A x_i ||_1) 6.795185e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.538734e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.670177e-14 max(|| b_i - A x_i ||_1) 6.795185e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.538734e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.670177e-14 max(|| b_i - A x_i ||_1) 6.795185e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.538734e-02 (SUCCESS) Start 2200: mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_tqrcpbegin 1935/3626 Test #2201: mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_tqrcpend .....***Timeout 382.62 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections 
Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.055111e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.698320e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.208683e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.461463e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.391678e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.652222e-01 s Time to initialize coeftab 3.490528e-01 s Time to factorize 4.505576e+00 s ( 2.22 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 2.839898e-01 s - iteration 1 : total iteration time 0.333 s error 2.8321e-16 Time for refinement 6.507486e-01 
s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.028813e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.028813e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.028813e-16 max(|| b_i - A x_i ||_1) 7.588626e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.535761e-04 (SUCCESS) max(|| b_i - A x_i ||_1) 7.588626e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.535761e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.028813e-16 max(|| b_i - A x_i ||_1) 7.588626e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.535761e-04 (SUCCESS) max(|| b_i - A x_i ||_1) 7.588626e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.535761e-04 (SUCCESS) Start 2201: mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_tqrcpend 1935/3626 Test #2202: mpi_dst_example_simple_lap_d_facto2_sched0_not_rqrrtbegin ...............***Timeout 382.61 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress 
minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.726968e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.804695e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.910102e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 3.536866e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.360214e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 7.100215e+00 s Time to initialize coeftab 1.647185e+00 s Time to factorize 8.604861e+00 s ( 1.16 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko 
------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 8.419988e-02 s - iteration 1 : total iteration time 0.0418 s error 6.4092e-13 Time for refinement 1.747091e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.409227e-13 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.409227e-13 max(|| b_i - A x_i ||_1) 1.271627e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.597908e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.409227e-13 max(|| b_i - A x_i ||_1) 1.271627e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.597908e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.271627e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.597908e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.409227e-13 max(|| b_i - A x_i ||_1) 1.271627e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.597908e+00 (SUCCESS) Start 2202: mpi_dst_example_simple_lap_d_facto2_sched0_not_rqrrtbegin 1935/3626 Test #2203: mpi_dst_example_simple_lap_d_facto2_sched0_not_rqrrtend .................***Timeout 382.60 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: 
PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.129154e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.501744e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.389807e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 6.737041e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.011186e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.415176e+00 s Time to initialize coeftab 6.979137e-01 s Time to factorize 1.560899e+01 s (655.02 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: 
------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 7.319871e-01 s - iteration 1 : total iteration time 1.75 s error 3.7786e-14 Time for refinement 3.075139e+00 s || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.778235e-14 max(|| b_i - A x_i ||_1) 4.051854e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.091503e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.778235e-14 max(|| b_i - A x_i ||_1) 4.051854e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.091503e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.778235e-14 max(|| b_i - A x_i ||_1) 4.051854e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.091503e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.778235e-14 max(|| b_i - A x_i ||_1) 4.051854e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.091503e-02 (SUCCESS) Start 2203: mpi_dst_example_simple_lap_d_facto2_sched0_not_rqrrtend 1935/3626 Test #2204: mpi_dst_example_simple_lap_d_facto2_sched0_kway_rqrrtbegin ..............***Timeout 382.59 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.095518e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.868735e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.193356e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.752684e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.048518e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize 
internal csc 9.352929e-02 s Time to initialize coeftab 2.484330e-01 s Time to factorize 6.035105e+00 s ( 1.65 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 1.548900e-01 s - iteration 1 : total iteration time 0.275 s error 6.4092e-13 Time for refinement 4.673534e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.409227e-13 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.409227e-13 max(|| b_i - A x_i ||_1) 1.271627e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.597908e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.271627e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.597908e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.409227e-13 max(|| b_i - A x_i ||_1) 1.271627e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.597908e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.409227e-13 max(|| b_i - A x_i ||_1) 1.271627e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.597908e+00 (SUCCESS) Start 2204: mpi_dst_example_simple_lap_d_facto2_sched0_kway_rqrrtbegin 1935/3626 Test #2205: mpi_dst_example_simple_lap_d_facto2_sched0_kway_rqrrtend ................***Timeout 382.57 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.240704e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.376737e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.400161e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 6.269026e+00 s +-------------------------------------------------+ Analyze task: 
Total time for analyze 4.193901e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.474605e+00 s Time to initialize coeftab 3.178109e-01 s Time to factorize 8.267095e+00 s ( 1.21 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 3.663423e-01 s - iteration 1 : total iteration time 0.748 s error 3.7786e-14 Time for refinement 1.749613e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.778235e-14 max(|| b_i - A x_i ||_1) 4.051854e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.091503e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.778235e-14 max(|| b_i - A x_i ||_1) 4.051854e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.091503e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.778235e-14 max(|| b_i - A x_i ||_1) 4.051854e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.091503e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.778235e-14 max(|| b_i - A x_i ||_1) 4.051854e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.091503e-02 (SUCCESS) Start 2205: mpi_dst_example_simple_lap_d_facto2_sched0_kway_rqrrtend 1935/3626 Test #2206: mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_rqrrtbegin ...***Timeout 382.55 sec ischedInit: The thread number has been 
automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.908739e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.275468e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.156037e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops 
Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 3.401229e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.454797e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 9.516526e-01 s Time to initialize coeftab 2.963753e+00 s Time to factorize 1.301319e+01 s (785.68 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 1.985831e-01 s - iteration 1 : total iteration time 0.25 s error 6.4092e-13 Time for refinement 5.836796e-01 s || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.409226e-13 max(|| b_i - A x_i ||_1) 1.271626e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.597907e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.409226e-13 max(|| b_i - A x_i ||_1) 1.271626e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.597907e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.409226e-13 max(|| b_i - A x_i ||_1) 1.271626e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.597907e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.409226e-13 max(|| b_i - A x_i ||_1) 1.271626e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.597907e+00 (SUCCESS) Start 2206: 
mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_rqrrtbegin 1935/3626 Test #2207: mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_rqrrtend .....***Timeout 382.54 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.561797e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.457384e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 
Time for reordering 2.046846e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 3.105660e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.333697e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 8.497577e-01 s Time to initialize coeftab 5.176161e-01 s Time to factorize 5.032866e+00 s ( 1.98 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 2.803524e-01 s - iteration 1 : total iteration time 0.312 s error 3.7786e-14 Time for refinement 7.789222e-01 s || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.778283e-14 max(|| b_i - A x_i ||_1) 4.052050e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.091749e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.778283e-14 max(|| b_i - A x_i ||_1) 4.052050e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.091749e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.778283e-14 max(|| b_i - A x_i ||_1) 4.052050e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.091749e-02 (SUCCESS) 
max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.778283e-14 max(|| b_i - A x_i ||_1) 4.052050e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.091749e-02 (SUCCESS) Start 2207: mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_rqrrtend 1935/3626 Test #2208: mpi_dst_example_simple_lap_d_facto2_sched0_kway_pqrcpilu0 ...............***Timeout 382.52 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.125970e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of 
L 17.592432 Time to compute symbol matrix 6.871871e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.919664e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.401677e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.420717e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 4.609682e-01 s Time to initialize coeftab 3.971200e-01 s Time to factorize 4.111927e+00 s ( 2.43 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 1.499244e-01 s - iteration 1 : total iteration time 0.225 s error 7.7807e-15 Time for refinement 5.084719e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.777982e-15 max(|| b_i - A x_i ||_1) 1.266018e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.590861e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.777982e-15 max(|| b_i - A x_i ||_1) 1.266018e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.590861e-02 (SUCCESS) max(|| 
b_i - A x_i ||_2 / || b_i ||_2) 7.777982e-15 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.777982e-15 max(|| b_i - A x_i ||_1) 1.266018e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.590861e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 1.266018e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.590861e-02 (SUCCESS) Start 2208: mpi_dst_example_simple_lap_d_facto2_sched0_kway_pqrcpilu0 1935/3626 Test #2209: mpi_dst_example_simple_lap_d_facto2_sched0_kway_pqrcpilu1 ...............***Timeout 382.48 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.971852e+01 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.345273e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.262508e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.311331e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.222824e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.260203e-01 s Time to initialize coeftab 7.889280e-01 s Time to factorize 2.428570e+00 s ( 4.11 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 1.493432e-01 s - iteration 1 : total iteration time 0.0753 s error 7.7807e-15 Time for refinement 2.480369e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.777982e-15 max(|| b_i - A x_i ||_1) 1.266018e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.590861e-02 (SUCCESS) 
max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.777982e-15 max(|| b_i - A x_i ||_1) 1.266018e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.590861e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.777982e-15 max(|| b_i - A x_i ||_1) 1.266018e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.590861e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.777982e-15 max(|| b_i - A x_i ||_1) 1.266018e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.590861e-02 (SUCCESS) Start 2209: mpi_dst_example_simple_lap_d_facto2_sched0_kway_pqrcpilu1 1935/3626 Test #2210: mpi_dst_example_simple_lap_c_facto0_sched0_not_svdbegin .................***Timeout 382.46 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 
200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.217027e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.545949e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.228937e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.629721e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.502029e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.120945e-01 s Time to initialize coeftab 8.498466e-01 s Time to factorize 1.477238e+01 s ( 1.37 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.880188e-01 s Time for refinement 1.572491e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.989540e-07 max(|| b_i - A 
x_i ||_1) 9.269723e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.339073e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.989540e-07 max(|| b_i - A x_i ||_1) 9.269723e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.339073e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.989540e-07 max(|| b_i - A x_i ||_1) 9.269723e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.339073e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.989540e-07 max(|| b_i - A x_i ||_1) 9.269723e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.339073e+00 (SUCCESS) Start 2210: mpi_dst_example_simple_lap_c_facto0_sched0_not_svdbegin 1935/3626 Test #2211: mpi_dst_example_simple_lap_c_facto0_sched0_not_svdend ...................***Timeout 382.45 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: 
Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.168216e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.942782e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.461058e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.433338e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.392769e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.598710e-01 s Time to initialize coeftab 4.100842e-01 s Time to factorize 1.825121e+01 s ( 1.11 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.382483e-01 s Time for refinement 1.085733e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 
2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.890346e-07 max(|| b_i - A x_i ||_1) 8.664751e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.186418e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.890346e-07 max(|| b_i - A x_i ||_1) 8.664751e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.186418e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.890346e-07 max(|| b_i - A x_i ||_1) 8.664751e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.186418e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.890346e-07 max(|| b_i - A x_i ||_1) 8.664751e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.186418e+00 (SUCCESS) Start 2211: mpi_dst_example_simple_lap_c_facto0_sched0_not_svdend 1935/3626 Test #2212: mpi_dst_example_simple_lap_c_facto0_sched0_kway_svdbegin ................***Timeout 382.44 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been 
automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.000352e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.038761e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.898029e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 4.971212e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.526259e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 5.665862e-01 s Time to initialize coeftab 3.917203e-01 s Time to factorize 2.469191e+01 s (841.07 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 6.540073e-01 s Time for refinement 6.050857e-01 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| 
b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.990082e-07 max(|| b_i - A x_i ||_1) 9.269298e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.338966e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.990082e-07 max(|| b_i - A x_i ||_1) 9.269298e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.338966e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.990082e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.990082e-07 max(|| b_i - A x_i ||_1) 9.269298e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.338966e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 9.269298e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.338966e+00 (SUCCESS) Start 2212: mpi_dst_example_simple_lap_c_facto0_sched0_kway_svdbegin 1935/3626 Test #2213: mpi_dst_example_simple_lap_c_facto0_sched0_kway_svdend ..................***Timeout 382.44 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting 
Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.101557e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.603154e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.031033e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.237084e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.364931e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.620171e-01 s Time to initialize coeftab 5.349496e-01 s Time to factorize 1.255205e+01 s ( 1.62 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 9.469574e-03 s Time for refinement 6.413783e-03 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 
5.112499e-02 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.896185e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.896185e-07 max(|| b_i - A x_i ||_1) 8.679255e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.190078e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.896185e-07 max(|| b_i - A x_i ||_1) 8.679255e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.190078e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.679255e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.190078e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.896185e-07 max(|| b_i - A x_i ||_1) 8.679255e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.190078e+00 (SUCCESS) Start 2213: mpi_dst_example_simple_lap_c_facto0_sched0_kway_svdend 1935/3626 Test #2214: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_svdbegin .....***Timeout 382.43 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute 
Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.833645e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.338268e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.572629e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 5.522200e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.399986e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.064268e-01 s Time to initialize coeftab 7.253017e-01 s Time to factorize 1.050439e+01 s ( 1.93 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 8.395763e-02 s Time for 
refinement 1.169997e-01 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.989540e-07 max(|| b_i - A x_i ||_1) 9.269723e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.339073e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.989540e-07 max(|| b_i - A x_i ||_1) 9.269723e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.339073e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.989540e-07 max(|| b_i - A x_i ||_1) 9.269723e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.339073e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.989540e-07 max(|| b_i - A x_i ||_1) 9.269723e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.339073e+00 (SUCCESS) Start 2214: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_svdbegin 1935/3626 Test #2215: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_svdend .......***Timeout 382.43 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress 
method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.919634e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.360850e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.565227e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.640158e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.132431e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.918087e-01 s Time to initialize coeftab 4.698353e-01 s Time to factorize 2.062235e+01 s (1007.04 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 
88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 6.740168e-02 s Time for refinement 4.387938e-02 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.890346e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.890346e-07 max(|| b_i - A x_i ||_1) 8.664751e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.186418e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.890346e-07 max(|| b_i - A x_i ||_1) 8.664751e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.186418e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.664751e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.186418e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.890346e-07 max(|| b_i - A x_i ||_1) 8.664751e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.186418e+00 (SUCCESS) Start 2215: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_svdend 1935/3626 Test #2216: mpi_dst_example_simple_lap_c_facto0_sched0_not_pqrcpbegin ...............***Timeout 382.42 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication 
support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.198170e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.894755e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.223631e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.137622e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.346328e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.487604e-01 s Time to initialize coeftab 1.087873e+00 s Time to factorize 7.373138e+00 s ( 2.75 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o 
Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 3.385999e-01 s - iteration 1 : total iteration time 0.255 s error 6.309e-11 Time for refinement 6.026946e-01 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.629023e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.629023e-08 max(|| b_i - A x_i ||_1) 3.381212e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.531976e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.629023e-08 max(|| b_i - A x_i ||_1) 3.381212e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.531976e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.629023e-08 max(|| b_i - A x_i ||_1) 3.381212e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.531976e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 3.381212e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.531976e-01 (SUCCESS) Start 2216: mpi_dst_example_simple_lap_c_facto0_sched0_not_pqrcpbegin 1935/3626 Test #2217: mpi_dst_example_simple_lap_c_facto0_sched0_not_pqrcpend .................***Timeout 382.41 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: 
Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.998852e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.393644e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.333554e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.069600e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.168168e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.990702e-01 s Time to initialize coeftab 4.977726e-01 s Time to factorize 2.096136e+00 s ( 
9.68 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 6.828196e-02 s Time for refinement 5.998983e-02 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.584095e-07 max(|| b_i - A x_i ||_1) 9.330522e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.354415e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.584095e-07 max(|| b_i - A x_i ||_1) 9.330522e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.354415e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.584095e-07 max(|| b_i - A x_i ||_1) 9.330522e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.354415e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.584095e-07 max(|| b_i - A x_i ||_1) 9.330522e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.354415e+00 (SUCCESS) Start 2217: mpi_dst_example_simple_lap_c_facto0_sched0_not_pqrcpend 1935/3626 Test #2218: mpi_dst_example_simple_lap_c_facto0_sched0_kway_pqrcpbegin ..............***Timeout 382.40 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread 
dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.934874e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.397106e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.744383e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.355166e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.651681e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize 
internal csc 1.066351e+00 s Time to initialize coeftab 3.306134e-01 s Time to factorize 1.549228e+01 s ( 1.31 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 3.063814e-01 s - iteration 1 : total iteration time 0.331 s error 6.311e-11 Time for refinement 6.619570e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.629023e-08 max(|| b_i - A x_i ||_1) 3.381212e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.531976e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.629023e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.629023e-08 max(|| b_i - A x_i ||_1) 3.381212e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.531976e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.629023e-08 max(|| b_i - A x_i ||_1) 3.381212e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.531976e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 3.381212e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.531976e-01 (SUCCESS) Start 2218: mpi_dst_example_simple_lap_c_facto0_sched0_kway_pqrcpbegin 1935/3626 Test #2219: mpi_dst_example_simple_lap_c_facto0_sched0_kway_pqrcpend ................***Timeout 382.39 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 
256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.647680e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.202627e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.032114e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.735516e+00 s +-------------------------------------------------+ 
Analyze task: Total time for analyze 4.195485e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.161277e+00 s Time to initialize coeftab 3.150002e-01 s Time to factorize 9.025967e+00 s ( 2.25 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 9.839987e-01 s Time for refinement 9.139811e-01 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.582568e-07 max(|| b_i - A x_i ||_1) 9.326536e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.353409e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.582568e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.582568e-07 max(|| b_i - A x_i ||_1) 9.326536e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.353409e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 9.326536e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.353409e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.582568e-07 max(|| b_i - A x_i ||_1) 9.326536e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.353409e+00 (SUCCESS) Start 2219: mpi_dst_example_simple_lap_c_facto0_sched0_kway_pqrcpend 1935/3626 Test #2220: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_pqrcpbegin ...***Timeout 382.39 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The 
thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.225009e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.462672e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.635300e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time 
to factorize 5.477282e-04 s Time for mapping/scheduling 6.131153e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.236556e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 8.840383e-01 s Time to initialize coeftab 4.984304e-01 s Time to factorize 2.318658e+01 s (895.67 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.729490e-02 s - iteration 1 : total iteration time 0.0109 s error 6.311e-11 Time for refinement 2.548742e-02 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.629023e-08 max(|| b_i - A x_i ||_1) 3.381212e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.531976e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.629023e-08 max(|| b_i - A x_i ||_1) 3.381212e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.531976e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.629023e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.629023e-08 max(|| b_i - A x_i ||_1) 3.381212e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.531976e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 3.381212e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.531976e-01 (SUCCESS) Start 2220: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_pqrcpbegin 
1935/3626 Test #2221: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_pqrcpend .....***Timeout 382.37 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.547545e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.990373e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.252637e-01 s 
+-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.312629e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.751516e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.978993e-01 s Time to initialize coeftab 2.932796e-01 s Time to factorize 6.953996e+00 s ( 2.92 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 3.189875e-01 s Time for refinement 3.600994e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.582568e-07 max(|| b_i - A x_i ||_1) 9.326536e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.353409e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.582568e-07 max(|| b_i - A x_i ||_1) 9.326536e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.353409e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.582568e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.582568e-07 max(|| b_i - A x_i ||_1) 9.326536e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.353409e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 9.326536e-08 
max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.353409e+00 (SUCCESS) Start 2221: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_pqrcpend 1935/3626 Test #2222: mpi_dst_example_simple_lap_c_facto0_sched0_not_rqrcpbegin ...............***Timeout 382.35 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.965247e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.181274e-03 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.500558e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.001956e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.098360e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.002071e-02 s Time to initialize coeftab 7.054395e-01 s Time to factorize 3.371885e+00 s ( 6.01 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 9.849009e-03 s - iteration 1 : total iteration time 0.00789 s error 6.3329e-11 Time for refinement 1.577980e-02 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.607526e-08 max(|| b_i - A x_i ||_1) 3.369537e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.502515e-01 (SUCCESS) || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.607526e-08 max(|| b_i - A x_i ||_1) 3.369537e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.502515e-01 (SUCCESS) || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.607526e-08 
max(|| b_i - A x_i ||_1) 3.369537e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.502515e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.607526e-08 max(|| b_i - A x_i ||_1) 3.369537e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.502515e-01 (SUCCESS) Start 2222: mpi_dst_example_simple_lap_c_facto0_sched0_not_rqrcpbegin 1935/3626 Test #2223: mpi_dst_example_simple_lap_c_facto0_sched0_not_rqrcpend .................***Timeout 382.33 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.353862e+01 s +-------------------------------------------------+ Symbolic 
factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.142937e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.670372e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.175658e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.701910e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.588020e-01 s Time to initialize coeftab 5.767638e-01 s Time to factorize 1.446646e+01 s ( 1.40 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 6.755948e-01 s Time for refinement 6.714089e-01 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.589400e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.589400e-07 max(|| b_i - A x_i ||_1) 9.341289e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.357132e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.589400e-07 max(|| b_i 
- A x_i ||_1) 9.341289e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.357132e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 9.341289e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.357132e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.589400e-07 max(|| b_i - A x_i ||_1) 9.341289e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.357132e+00 (SUCCESS) Start 2223: mpi_dst_example_simple_lap_c_facto0_sched0_not_rqrcpend 1935/3626 Test #2224: mpi_dst_example_simple_lap_c_facto0_sched0_kway_rqrcpbegin ..............***Timeout 382.33 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch 
Time to compute ordering 3.405715e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.971364e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.461315e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.610511e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.673211e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 8.655791e-01 s Time to initialize coeftab 9.120435e-01 s Time to factorize 4.232857e+00 s ( 4.79 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 2.140500e-02 s - iteration 1 : total iteration time 0.0239 s error 6.3329e-11 Time for refinement 4.201003e-02 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.607526e-08 max(|| b_i - A x_i ||_1) 3.369537e-08 max(|| b_i - A x_i ||_1 / 
(||A||_1 * ||x_i||_oo * eps)) 8.502515e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.607526e-08 max(|| b_i - A x_i ||_1) 3.369537e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.502515e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.607526e-08 max(|| b_i - A x_i ||_1) 3.369537e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.502515e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.607526e-08 max(|| b_i - A x_i ||_1) 3.369537e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.502515e-01 (SUCCESS) Start 2224: mpi_dst_example_simple_lap_c_facto0_sched0_kway_rqrcpbegin 1935/3626 Test #2225: mpi_dst_example_simple_lap_c_facto0_sched0_kway_rqrcpend ................***Timeout 382.33 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 
nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.216148e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.222776e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.042224e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.896399e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.809202e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.195959e+00 s Time to initialize coeftab 2.485470e-01 s Time to factorize 2.055563e+01 s (1010.31 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 5.091985e-01 s Time for refinement 7.253766e-01 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - 
A x_i ||_2 / || b_i ||_2) 2.589400e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.589400e-07 max(|| b_i - A x_i ||_1) 9.341289e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.357132e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 9.341289e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.357132e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.589400e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.589400e-07 max(|| b_i - A x_i ||_1) 9.341289e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.357132e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 9.341289e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.357132e+00 (SUCCESS) Start 2225: mpi_dst_example_simple_lap_c_facto0_sched0_kway_rqrcpend 1935/3626 Test #2226: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_rqrcpbegin ...***Timeout 382.31 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: 
The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.034478e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.575135e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.169105e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.087672e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.351950e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.497761e-01 s Time to initialize coeftab 1.192393e+00 s Time to factorize 2.836987e+01 s (732.03 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 4.079621e-01 s - iteration 1 : total iteration time 0.667 s error 6.3304e-11 Time for refinement 1.440810e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A 
||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.618122e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.618122e-08 max(|| b_i - A x_i ||_1) 3.380595e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.530418e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 3.380595e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.530418e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.618122e-08 max(|| b_i - A x_i ||_1) 3.380595e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.530418e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.618122e-08 max(|| b_i - A x_i ||_1) 3.380595e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.530418e-01 (SUCCESS) Start 2226: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_rqrcpbegin 1935/3626 Test #2227: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_rqrcpend .....***Timeout 382.30 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 [arch-nspawn-3655178:1054656] *** Process received signal *** [arch-nspawn-3655178:1054656] Signal: Segmentation fault (11) [arch-nspawn-3655178:1054656] Signal code: Address not mapped (1) [arch-nspawn-3655178:1054656] Failing at address: 0x7f5a5c1a1860 [arch-nspawn-3655178:1054656] [ 0] linux-vdso.so.1(__vdso_rt_sigreturn+0x0) [0x7f26534e86cc] [arch-nspawn-3655178:1054656] [ 1] /usr/lib/libopen-pal.so.80(mca_btl_sm_poll_handle_frag+0x18a) [0x7f265140aa02] [arch-nspawn-3655178:1054656] [ 2] /usr/lib/libopen-pal.so.80(+0x74504) [0x7f265140b504] [arch-nspawn-3655178:1054656] [ 3] /usr/lib/libopen-pal.so.80(opal_progress+0x30) [0x7f26513bca7a] [arch-nspawn-3655178:1054656] [ 4] /usr/lib/libopen-pal.so.80(ompi_sync_wait_mt+0xda) [0x7f26513e9aa2] 
[arch-nspawn-3655178:1054656] [ 5] /usr/lib/libmpi.so.40(+0x7de1a) [0x7f2651c7de1a] [arch-nspawn-3655178:1054656] [ 6] /usr/lib/libmpi.so.40(ompi_request_default_wait+0x1a) [0x7f2651c8019c] [arch-nspawn-3655178:1054656] [ 7] /usr/lib/libmpi.so.40(ompi_coll_base_sendrecv_actual+0x98) [0x7f2651cf03e8] [arch-nspawn-3655178:1054656] [ 8] /usr/lib/libmpi.so.40(ompi_coll_base_allreduce_intra_recursivedoubling+0x210) [0x7f2651cf1a88] [arch-nspawn-3655178:1054656] [ 9] /usr/lib/libmpi.so.40(ompi_coll_base_allreduce_intra_ring+0x3fc) [0x7f2651cf443c] [arch-nspawn-3655178:1054656] [10] /usr/lib/libmpi.so.40(ompi_coll_tuned_allreduce_intra_dec_fixed+0x40) [0x7f2651d15152] [arch-nspawn-3655178:1054656] [11] /usr/lib/libmpi.so.40(MPI_Allreduce+0x294) [0x7f2651c8e584] [arch-nspawn-3655178:1054656] [12] /build/pastix/src/build/spm/src/libspm.so.1(spmUpdateComputedFields+0x140) [0x7f2651f3f458] [arch-nspawn-3655178:1054656] [13] /build/pastix/src/build/spm/src/libspm.so.1(genLaplacian+0xaa) [0x7f2651f4821e] [arch-nspawn-3655178:1054656] [14] /build/pastix/src/build/spm/src/libspm.so.1(+0x409c8) [0x7f2651f499c8] [arch-nspawn-3655178:1054656] [15] ./simple(+0xe2c) [0x555555556e2c] [arch-nspawn-3655178:1054656] [16] /usr/lib/libc.so.6(+0x27fae) [0x7f2651aa4fae] [arch-nspawn-3655178:1054656] [17] /usr/lib/libc.so.6(__libc_start_main+0x72) [0x7f2651aa50b8] [arch-nspawn-3655178:1054656] [18] ./simple(+0x1174) [0x555555557174] [arch-nspawn-3655178:1054656] *** End of error message *** -------------------------------------------------------------------------- prte noticed that process rank 3 with PID 1054656 on node arch-nspawn-3655178 exited on signal 11 (Segmentation fault). 
-------------------------------------------------------------------------- Start 2227: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_rqrcpend 1935/3626 Test #2228: mpi_dst_example_simple_lap_c_facto0_sched0_not_tqrcpbegin ...............***Timeout 382.29 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.754198e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.300918e-03 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.784508e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.374146e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.942189e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.658237e-01 s Time to initialize coeftab 9.648295e-01 s Time to factorize 2.114057e+01 s (982.36 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.853718e-01 s - iteration 1 : total iteration time 0.294 s error 6.2995e-11 Time for refinement 6.418533e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.563754e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.563754e-08 max(|| b_i - A x_i ||_1) 3.344537e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.439431e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 3.344537e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.439431e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.563754e-08 
max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.563754e-08 max(|| b_i - A x_i ||_1) 3.344537e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.439431e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 3.344537e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.439431e-01 (SUCCESS) Start 2228: mpi_dst_example_simple_lap_c_facto0_sched0_not_tqrcpbegin 1935/3626 Test #2229: mpi_dst_example_simple_lap_c_facto0_sched0_not_tqrcpend .................***Timeout 382.28 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.046509e+01 s +-------------------------------------------------+ Symbolic 
factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.004880e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.394277e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.907656e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.366903e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 7.791206e-02 s Time to initialize coeftab 4.545576e-01 s Time to factorize 1.135606e+00 s (17.86 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 3.564545e-02 s Time for refinement 3.511757e-02 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.588291e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.588291e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.588291e-07 max(|| b_i - A x_i ||_1) 9.348086e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.358847e+00 (SUCCESS) max(|| b_i 
- A x_i ||_2 / || b_i ||_2) 2.588291e-07 max(|| b_i - A x_i ||_1) 9.348086e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.358847e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 9.348086e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.358847e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 9.348086e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.358847e+00 (SUCCESS) Start 2229: mpi_dst_example_simple_lap_c_facto0_sched0_not_tqrcpend 1935/3626 Test #2230: mpi_dst_example_simple_lap_c_facto0_sched0_kway_tqrcpbegin ..............***Timeout 382.27 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch 
Time to compute ordering 3.015328e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.253286e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.543349e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.950827e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.319391e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.490858e-02 s Time to initialize coeftab 6.380787e-01 s Time to factorize 2.782077e+00 s ( 7.29 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.850373e-02 s - iteration 1 : total iteration time 0.0144 s error 6.2991e-11 Time for refinement 3.073674e-02 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.536904e-08 max(|| b_i - A x_i ||_1) 3.333061e-08 max(|| b_i - A x_i ||_1 / 
(||A||_1 * ||x_i||_oo * eps)) 8.410473e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.536904e-08 max(|| b_i - A x_i ||_1) 3.333061e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.410473e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.536904e-08 max(|| b_i - A x_i ||_1) 3.333061e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.410473e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.536904e-08 max(|| b_i - A x_i ||_1) 3.333061e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.410473e-01 (SUCCESS) Start 2230: mpi_dst_example_simple_lap_c_facto0_sched0_kway_tqrcpbegin 1935/3626 Test #2231: mpi_dst_example_simple_lap_c_facto0_sched0_kway_tqrcpend ................***Timeout 382.25 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 
nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.181579e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.444788e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.234399e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 6.715540e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.307134e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.665885e-02 s Time to initialize coeftab 7.147674e-01 s Time to factorize 3.424973e+00 s ( 5.92 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 2.701618e-02 s Time for refinement 3.516422e-02 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A 
x_i ||_2 / || b_i ||_2) 2.588291e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.588291e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.588291e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.588291e-07 max(|| b_i - A x_i ||_1) 9.348086e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.358847e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 9.348086e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.358847e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 9.348086e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.358847e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 9.348086e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.358847e+00 (SUCCESS) Start 2231: mpi_dst_example_simple_lap_c_facto0_sched0_kway_tqrcpend 1935/3626 Test #2233: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_tqrcpend .....***Timeout 382.25 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The 
thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.058355e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.172725e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.697279e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.343069e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.331863e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.983624e-01 s Time to initialize coeftab 4.926484e-01 s Time to factorize 5.945676e+00 s ( 3.41 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 2.159823e-01 s Time for refinement 1.747003e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 
6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.588291e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.588291e-07 max(|| b_i - A x_i ||_1) 9.348086e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.358847e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.588291e-07 max(|| b_i - A x_i ||_1) 9.348086e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.358847e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 9.348086e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.358847e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.588291e-07 max(|| b_i - A x_i ||_1) 9.348086e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.358847e+00 (SUCCESS) Start 2233: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_tqrcpend 1935/3626 Test #2234: mpi_dst_example_simple_lap_c_facto0_sched0_not_rqrrtbegin ...............***Timeout 382.24 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number 
has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.247298e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.987976e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.777549e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 9.992457e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.350426e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.842815e-01 s Time to initialize coeftab 1.060849e+00 s Time to factorize 4.002732e+00 s ( 5.07 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 3.777730e-02 s - iteration 1 : total iteration time 0.0504 s error 6.2913e-11 Time for refinement 1.080068e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i 
||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.607712e-08 max(|| b_i - A x_i ||_1) 3.370653e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.505331e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.607712e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.607712e-08 max(|| b_i - A x_i ||_1) 3.370653e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.505331e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.607712e-08 max(|| b_i - A x_i ||_1) 3.370653e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.505331e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 3.370653e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.505331e-01 (SUCCESS) Start 2234: mpi_dst_example_simple_lap_c_facto0_sched0_not_rqrrtbegin 1935/3626 Test #2235: mpi_dst_example_simple_lap_c_facto0_sched0_not_rqrrtend .................***Timeout 382.23 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time 
Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.720366e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.248456e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.750737e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 8.161537e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.827434e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.113391e-01 s Time to initialize coeftab 3.154889e+00 s Time to factorize 4.314380e+00 s ( 4.70 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 
2.627442e-01 s Time for refinement 7.819464e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.582207e-07 max(|| b_i - A x_i ||_1) 9.316616e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.350906e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.582207e-07 max(|| b_i - A x_i ||_1) 9.316616e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.350906e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.582207e-07 max(|| b_i - A x_i ||_1) 9.316616e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.350906e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.582207e-07 max(|| b_i - A x_i ||_1) 9.316616e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.350906e+00 (SUCCESS) Start 2235: mpi_dst_example_simple_lap_c_facto0_sched0_not_rqrrtend 1935/3626 Test #2236: mpi_dst_example_simple_lap_c_facto0_sched0_kway_rqrrtbegin ..............***Timeout 382.21 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 
1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.832866e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.929382e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.845881e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 5.846924e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.523929e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.967096e-01 s Time to initialize coeftab 1.137322e+00 s Time to factorize 7.097286e+00 s ( 2.86 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 
88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.628243e-01 s - iteration 1 : total iteration time 0.14 s error 6.2999e-11 Time for refinement 3.111379e-01 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.607837e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.607837e-08 max(|| b_i - A x_i ||_1) 3.370617e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.505239e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 3.370617e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.505239e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.607837e-08 max(|| b_i - A x_i ||_1) 3.370617e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.505239e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.607837e-08 max(|| b_i - A x_i ||_1) 3.370617e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.505239e-01 (SUCCESS) Start 2236: mpi_dst_example_simple_lap_c_facto0_sched0_kway_rqrrtbegin 1935/3626 Test #2238: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_rqrrtbegin ...***Timeout 382.19 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: 
PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.738729e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.578025e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.521454e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 4.842927e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.271662e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.693390e-01 s Time to initialize coeftab 2.513020e+00 s Time to factorize 1.234173e+01 s ( 1.64 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: 
------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 2.203232e-01 s - iteration 1 : total iteration time 0.306 s error 6.2913e-11 Time for refinement 6.422319e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.607712e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.607712e-08 max(|| b_i - A x_i ||_1) 3.370653e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.505331e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.607712e-08 max(|| b_i - A x_i ||_1) 3.370653e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.505331e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.607712e-08 max(|| b_i - A x_i ||_1) 3.370653e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.505331e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 3.370653e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.505331e-01 (SUCCESS) Start 2238: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_rqrrtbegin 1935/3626 Test #2239: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_rqrrtend .....***Timeout 382.19 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled 
PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.765778e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.602936e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.976238e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 6.553866e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.651186e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize 
internal csc 1.384801e+00 s Time to initialize coeftab 2.571162e+00 s Time to factorize 1.823603e+01 s ( 1.11 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 2.543930e-01 s Time for refinement 2.824727e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.579101e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.579101e-07 max(|| b_i - A x_i ||_1) 9.306854e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.348443e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 9.306854e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.348443e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.579101e-07 max(|| b_i - A x_i ||_1) 9.306854e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.348443e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.579101e-07 max(|| b_i - A x_i ||_1) 9.306854e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.348443e+00 (SUCCESS) Start 2239: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_rqrrtend 1935/3626 Test #2240: mpi_dst_example_simple_lap_c_facto0_sched0_kway_pqrcpilu0 ...............***Timeout 382.18 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been 
automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.526930e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.995892e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.471574e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.626236e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 
2.728676e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.055024e-01 s Time to initialize coeftab 2.109974e-01 s Time to factorize 1.907800e+01 s ( 1.06 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Start 2240: mpi_dst_example_simple_lap_c_facto0_sched0_kway_pqrcpilu0 1935/3626 Test #2241: mpi_dst_example_simple_lap_c_facto0_sched0_kway_pqrcpilu1 ...............***Timeout 382.17 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: 
Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.659260e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.215895e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.448626e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 5.828922e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.872150e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 7.333523e-02 s Time to initialize coeftab 6.073558e-01 s Time to factorize 2.050976e+00 s ( 9.89 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.225403e-02 s - iteration 1 : total iteration time 0.0125 s error 3.0399e-11 Time for refinement 2.883859e-02 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 
2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.424494e-08 max(|| b_i - A x_i ||_1) 3.236631e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.167145e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.424494e-08 max(|| b_i - A x_i ||_1) 3.236631e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.167145e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.424494e-08 max(|| b_i - A x_i ||_1) 3.236631e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.167145e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.424494e-08 max(|| b_i - A x_i ||_1) 3.236631e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.167145e-01 (SUCCESS) Start 2241: mpi_dst_example_simple_lap_c_facto0_sched0_kway_pqrcpilu1 1935/3626 Test #2243: mpi_dst_example_simple_lap_c_facto1_sched0_not_svdend ...................***Timeout 382.14 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of 
projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.918500e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.459010e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.546695e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.785440e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.350544e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 7.591116e-01 s Time to initialize coeftab 5.170158e-01 s Time to factorize 2.057112e+01 s ( 1.04 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 4.820268e-01 s Time for refinement 5.385603e-01 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i 
||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.813394e-07 max(|| b_i - A x_i ||_1) 7.923965e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.999492e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.813394e-07 max(|| b_i - A x_i ||_1) 7.923965e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.999492e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.813394e-07 max(|| b_i - A x_i ||_1) 7.923965e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.999492e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.813394e-07 max(|| b_i - A x_i ||_1) 7.923965e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.999492e+00 (SUCCESS) Start 2243: mpi_dst_example_simple_lap_c_facto1_sched0_not_svdend 1935/3626 Test #2245: mpi_dst_example_simple_lap_c_facto1_sched0_kway_svdend ..................***Timeout 382.11 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 
2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.335034e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.998458e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.703178e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.663244e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.841872e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 6.873372e-01 s Time to initialize coeftab 5.418577e-01 s Time to factorize 2.817393e+01 s (774.45 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 2.148677e-02 s Time for refinement 1.368848e-02 s || A ||_1 
5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.812972e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.812972e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.812972e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.812972e-07 max(|| b_i - A x_i ||_1) 7.923540e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.999385e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 7.923540e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.999385e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 7.923540e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.999385e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 7.923540e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.999385e+00 (SUCCESS) Start 2245: mpi_dst_example_simple_lap_c_facto1_sched0_kway_svdend 1935/3626 Test #2246: mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_svdbegin .....***Timeout 382.10 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion 
per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.033904e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.952194e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.105169e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.150746e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.258723e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.903396e-01 s Time to initialize coeftab 3.603187e-01 s Time to factorize 1.504008e+01 s ( 1.42 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko 
------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 2.900327e-01 s Time for refinement 2.428966e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.937753e-07 max(|| b_i - A x_i ||_1) 8.680898e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.190493e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.937753e-07 max(|| b_i - A x_i ||_1) 8.680898e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.190493e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.937753e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.937753e-07 max(|| b_i - A x_i ||_1) 8.680898e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.190493e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.680898e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.190493e+00 (SUCCESS) Start 2246: mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_svdbegin 1935/3626 Test #2247: mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_svdend .......***Timeout 382.08 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel 
MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.807211e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.005768e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.594368e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 9.548051e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.068392e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.043015e+00 s Time to initialize coeftab 3.281123e+00 s Time to factorize 1.934919e+01 s ( 1.10 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 
o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.546284e-01 s Time for refinement 1.818604e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.812972e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.812972e-07 max(|| b_i - A x_i ||_1) 7.923540e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.999385e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 7.923540e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.999385e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.812972e-07 max(|| b_i - A x_i ||_1) 7.923540e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.999385e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.812972e-07 max(|| b_i - A x_i ||_1) 7.923540e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.999385e+00 (SUCCESS) Start 2247: mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_svdend 1935/3626 Test #2248: mpi_dst_example_simple_lap_c_facto1_sched0_not_pqrcpbegin ...............***Timeout 382.08 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: 
PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.294383e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.456684e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.128724e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.966687e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.904975e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.245670e+00 s Time to initialize coeftab 2.864367e-01 s Time to factorize 6.805908e+01 s (320.59 KFlop/s) Number of operations 106.83 
MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 4.069890e-01 s - iteration 1 : total iteration time 0.807 s error 6.3017e-11 Time for refinement 1.466986e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.496733e-08 max(|| b_i - A x_i ||_1) 3.175906e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.013915e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.496733e-08 max(|| b_i - A x_i ||_1) 3.175906e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.013915e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.496733e-08 max(|| b_i - A x_i ||_1) 3.175906e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.013915e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.496733e-08 max(|| b_i - A x_i ||_1) 3.175906e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.013915e-01 (SUCCESS) Start 2248: mpi_dst_example_simple_lap_c_facto1_sched0_not_pqrcpbegin 1935/3626 Test #2249: mpi_dst_example_simple_lap_c_facto1_sched0_not_pqrcpend .................***Timeout 382.06 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.506192e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.790948e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.835485e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 4.720109e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.239971e+01 s 
+-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.046125e+00 s Time to initialize coeftab 2.367958e-01 s Time to factorize 3.454534e+00 s ( 6.17 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 4.954439e-01 s Time for refinement 4.180143e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.515058e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.515058e-07 max(|| b_i - A x_i ||_1) 8.568539e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.162140e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.568539e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.162140e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.515058e-07 max(|| b_i - A x_i ||_1) 8.568539e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.162140e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.515058e-07 max(|| b_i - A x_i ||_1) 8.568539e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.162140e+00 (SUCCESS) Start 2249: mpi_dst_example_simple_lap_c_facto1_sched0_not_pqrcpend 1935/3626 Test #2250: mpi_dst_example_simple_lap_c_facto1_sched0_kway_pqrcpbegin ..............***Timeout 382.05 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.915097e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.523299e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.012072e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 
2.553782e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.241801e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 5.535963e-01 s Time to initialize coeftab 5.813067e-01 s Time to factorize 3.460455e+01 s (630.53 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 3.823965e-02 s - iteration 1 : total iteration time 0.0543 s error 6.302e-11 Time for refinement 1.115472e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.508534e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.508534e-08 max(|| b_i - A x_i ||_1) 3.183547e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.033196e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.508534e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.508534e-08 max(|| b_i - A x_i ||_1) 3.183547e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.033196e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 3.183547e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.033196e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 3.183547e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.033196e-01 (SUCCESS) Start 2250: mpi_dst_example_simple_lap_c_facto1_sched0_kway_pqrcpbegin 1935/3626 Test #2251: 
mpi_dst_example_simple_lap_c_facto1_sched0_kway_pqrcpend ................***Timeout 382.04 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.120367e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.924893e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.186514e-01 s +-------------------------------------------------+ Mapping/Scheduling 
subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.822516e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.540966e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 7.642614e-01 s Time to initialize coeftab 6.761196e-01 s Time to factorize 7.373508e+00 s ( 2.89 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 9.808130e-01 s Time for refinement 8.935951e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.515204e-07 max(|| b_i - A x_i ||_1) 8.568798e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.162206e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.515204e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.515204e-07 max(|| b_i - A x_i ||_1) 8.568798e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.162206e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.515204e-07 max(|| b_i - A x_i ||_1) 8.568798e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.162206e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.568798e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.162206e+00 
(SUCCESS) Start 2251: mpi_dst_example_simple_lap_c_facto1_sched0_kway_pqrcpend 1935/3626 Test #2252: mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_pqrcpbegin ...***Timeout 382.03 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.455493e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.683395e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping 
criterion -1 Time for reordering 1.028590e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.113006e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.005283e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 7.806958e-01 s Time to initialize coeftab 6.681969e-01 s Time to factorize 4.425048e+01 s (493.09 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 2.106344e-01 s - iteration 1 : total iteration time 0.295 s error 6.302e-11 Time for refinement 5.885811e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.508534e-08 max(|| b_i - A x_i ||_1) 3.183547e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.033196e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.508534e-08 max(|| b_i - A x_i ||_1) 3.183547e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.033196e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.508534e-08 max(|| b_i - A x_i ||_1) 3.183547e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 
8.033196e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.508534e-08 max(|| b_i - A x_i ||_1) 3.183547e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.033196e-01 (SUCCESS) Start 2252: mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_pqrcpbegin 1935/3626 Test #2253: mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_pqrcpend .....***Timeout 382.01 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.371967e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of 
nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.816202e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.457297e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.179640e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.351918e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.517556e+00 s Time to initialize coeftab 5.043771e+00 s Time to factorize 1.195899e+01 s ( 1.78 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.978999e+00 s Time for refinement 2.261378e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.515204e-07 max(|| b_i - A x_i ||_1) 8.568798e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.162206e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.515204e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.515204e-07 max(|| b_i - A x_i ||_1) 8.568798e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * 
||x_i||_oo * eps)) 2.162206e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.568798e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.162206e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.515204e-07 max(|| b_i - A x_i ||_1) 8.568798e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.162206e+00 (SUCCESS) Start 2253: mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_pqrcpend 1935/3626 Test #2254: mpi_dst_example_simple_lap_c_facto1_sched0_not_rqrcpbegin ...............***Timeout 382.01 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.091503e+01 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.867273e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.834467e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.881475e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.543294e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.406514e-01 s Time to initialize coeftab 9.029654e-01 s Time to factorize 1.758807e+01 s ( 1.21 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.254984e+00 s - iteration 1 : total iteration time 1.13 s error 6.3019e-11 Time for refinement 2.490112e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.619674e-08 max(|| b_i - A x_i ||_1) 3.293746e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.311267e-01 
(SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.619674e-08 max(|| b_i - A x_i ||_1) 3.293746e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.311267e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.619674e-08 max(|| b_i - A x_i ||_1) 3.293746e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.311267e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.619674e-08 max(|| b_i - A x_i ||_1) 3.293746e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.311267e-01 (SUCCESS) Start 2254: mpi_dst_example_simple_lap_c_facto1_sched0_not_rqrcpbegin 1935/3626 Test #2255: mpi_dst_example_simple_lap_c_facto1_sched0_not_rqrcpend .................***Timeout 382.00 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 
300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.403130e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.360440e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.306363e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.106466e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.662926e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.023442e+00 s Time to initialize coeftab 2.036742e+00 s Time to factorize 2.628887e+01 s (829.98 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 2.838691e-01 s Time for refinement 2.952071e-01 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.514269e-07 
max(|| b_i - A x_i ||_1) 8.565316e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.161327e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.514269e-07 max(|| b_i - A x_i ||_1) 8.565316e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.161327e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.514269e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.514269e-07 max(|| b_i - A x_i ||_1) 8.565316e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.161327e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.565316e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.161327e+00 (SUCCESS) Start 2255: mpi_dst_example_simple_lap_c_facto1_sched0_not_rqrcpend 1935/3626 Test #2256: mpi_dst_example_simple_lap_c_facto1_sched0_kway_rqrcpbegin ..............***Timeout 381.98 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.550712e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.166950e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.775099e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.650005e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.791862e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.540226e-01 s Time to initialize coeftab 8.445099e-01 s Time to factorize 2.610157e+01 s (835.94 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 7.259924e-02 s - iteration 1 : total iteration time 0.0496 s error 6.3086e-11 Time for refinement 9.137772e-02 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 
max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.609108e-08 max(|| b_i - A x_i ||_1) 3.292373e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.307802e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.609108e-08 max(|| b_i - A x_i ||_1) 3.292373e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.307802e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.609108e-08 max(|| b_i - A x_i ||_1) 3.292373e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.307802e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.609108e-08 max(|| b_i - A x_i ||_1) 3.292373e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.307802e-01 (SUCCESS) Start 2256: mpi_dst_example_simple_lap_c_facto1_sched0_kway_rqrcpbegin 1935/3626 Test #2257: mpi_dst_example_simple_lap_c_facto1_sched0_kway_rqrcpend ................***Timeout 381.95 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of 
projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.139425e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.304963e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.397908e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.867899e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.456471e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.771945e-01 s Time to initialize coeftab 4.900051e-01 s Time to factorize 8.842074e+00 s ( 2.41 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.691929e-01 s Time for refinement 1.704558e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i 
||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.511700e-07 max(|| b_i - A x_i ||_1) 8.560414e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.160090e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.511700e-07 max(|| b_i - A x_i ||_1) 8.560414e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.160090e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.511700e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.511700e-07 max(|| b_i - A x_i ||_1) 8.560414e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.160090e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.560414e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.160090e+00 (SUCCESS) Start 2257: mpi_dst_example_simple_lap_c_facto1_sched0_kway_rqrcpend 1935/3626 Test #2258: mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_rqrcpbegin ...***Timeout 381.94 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 
1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.016755e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.610725e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.893019e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.570834e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.323015e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 8.428973e-01 s Time to initialize coeftab 9.616589e-01 s Time to factorize 5.100275e+01 s (427.81 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.236955e-02 s - iteration 1 : total 
iteration time 0.0178 s error 6.3086e-11 Time for refinement 3.080918e-02 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.609108e-08 max(|| b_i - A x_i ||_1) 3.292373e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.307802e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.609108e-08 max(|| b_i - A x_i ||_1) 3.292373e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.307802e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.609108e-08 max(|| b_i - A x_i ||_1) 3.292373e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.307802e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.609108e-08 max(|| b_i - A x_i ||_1) 3.292373e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.307802e-01 (SUCCESS) Start 2258: mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_rqrcpbegin 1935/3626 Test #2259: mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_rqrcpend .....***Timeout 381.92 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size 
(min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.234674e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.031896e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.551471e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.721982e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.622658e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 9.301854e-01 s Time to initialize coeftab 4.835850e-01 s Time to factorize 2.789233e+01 s (782.27 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside 
not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 4.376924e-01 s Time for refinement 1.378278e-01 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.514269e-07 max(|| b_i - A x_i ||_1) 8.565316e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.161327e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.514269e-07 max(|| b_i - A x_i ||_1) 8.565316e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.161327e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.514269e-07 max(|| b_i - A x_i ||_1) 8.565316e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.161327e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.514269e-07 max(|| b_i - A x_i ||_1) 8.565316e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.161327e+00 (SUCCESS) Start 2259: mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_rqrcpend 1935/3626 Test #2260: mpi_dst_example_simple_lap_c_facto1_sched0_not_tqrcpbegin ...............***Timeout 381.92 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of 
threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 +-------------------------------------------------+ Ordering subtask : 1: 300 1140 2: 200 760 3: 200 660 Ordering method is: Scotch Time to compute ordering 2.374103e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.283822e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.019643e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.964888e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.812727e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 8.403372e-01 s Time to initialize coeftab 4.640950e-01 s Time to factorize 1.531910e+01 s ( 1.39 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: 
------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.146960e-01 s - iteration 1 : total iteration time 0.177 s error 6.3162e-11 Time for refinement 2.886341e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.709000e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.709000e-08 max(|| b_i - A x_i ||_1) 3.282527e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.282958e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.709000e-08 max(|| b_i - A x_i ||_1) 3.282527e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.282958e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 3.282527e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.282958e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.709000e-08 max(|| b_i - A x_i ||_1) 3.282527e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.282958e-01 (SUCCESS) Start 2260: mpi_dst_example_simple_lap_c_facto1_sched0_not_tqrcpbegin 1935/3626 Test #2261: mpi_dst_example_simple_lap_c_facto1_sched0_not_tqrcpend .................***Timeout 381.88 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of 
threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.860745e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.213635e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.734300e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 4.354332e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.563818e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 
1.820683e+00 s Time to initialize coeftab 1.640292e+00 s Time to factorize 1.292921e+01 s ( 1.65 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.800276e-01 s Time for refinement 1.476405e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.513135e-07 max(|| b_i - A x_i ||_1) 8.565777e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.161444e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.513135e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.513135e-07 max(|| b_i - A x_i ||_1) 8.565777e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.161444e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.565777e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.161444e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.513135e-07 max(|| b_i - A x_i ||_1) 8.565777e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.161444e+00 (SUCCESS) Start 2261: mpi_dst_example_simple_lap_c_facto1_sched0_not_tqrcpend 1935/3626 Test #2263: mpi_dst_example_simple_lap_c_facto1_sched0_kway_tqrcpend ................***Timeout 381.85 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.019172e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.313437e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.318552e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.776649e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 
4.414654e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.126799e+00 s Time to initialize coeftab 9.868054e-01 s Time to factorize 2.524648e+01 s (864.25 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 4.190409e-01 s Time for refinement 2.916502e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.513135e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.513135e-07 max(|| b_i - A x_i ||_1) 8.565777e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.161444e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.565777e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.161444e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.513135e-07 max(|| b_i - A x_i ||_1) 8.565777e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.161444e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.513135e-07 max(|| b_i - A x_i ||_1) 8.565777e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.161444e+00 (SUCCESS) Start 2263: mpi_dst_example_simple_lap_c_facto1_sched0_kway_tqrcpend 1935/3626 Test #2264: mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_tqrcpbegin ...***Timeout 381.84 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set 
to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.893271e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.490927e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.794983e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for 
mapping/scheduling 1.675042e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.649621e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.985494e+00 s Time to initialize coeftab 6.399722e-01 s Time to factorize 1.601497e+01 s ( 1.33 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 9.251776e-03 s - iteration 1 : total iteration time 0.00676 s error 6.3055e-11 Time for refinement 1.377573e-02 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.715538e-08 max(|| b_i - A x_i ||_1) 3.295780e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.316400e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.715538e-08 max(|| b_i - A x_i ||_1) 3.295780e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.316400e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.715538e-08 max(|| b_i - A x_i ||_1) 3.295780e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.316400e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.715538e-08 max(|| b_i - A x_i ||_1) 3.295780e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.316400e-01 (SUCCESS) Start 2264: mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_tqrcpbegin 1935/3626 Test #2265: 
mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_tqrcpend .....***Timeout 381.83 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.841661e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.422316e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.421680e-02 s +-------------------------------------------------+ 
Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 4.187908e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.355241e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 7.642279e-01 s Time to initialize coeftab 2.977808e-01 s Time to factorize 1.531191e+01 s ( 1.39 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.036618e-01 s Time for refinement 1.246264e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.510564e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.510564e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.510564e-07 max(|| b_i - A x_i ||_1) 8.560875e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.160207e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.510564e-07 max(|| b_i - A x_i ||_1) 8.560875e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.160207e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.560875e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.160207e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.560875e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * 
eps)) 2.160207e+00 (SUCCESS) Start 2265: mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_tqrcpend 1935/3626 Test #2266: mpi_dst_example_simple_lap_c_facto1_sched0_not_rqrrtbegin ...............***Timeout 381.81 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.381987e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.272446e-01 s +-------------------------------------------------+ Reordering subtask: 
Split level 0 Stoping criterion -1 Time for reordering 8.535154e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.724299e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.820392e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.715826e-01 s Time to initialize coeftab 6.755088e-01 s Time to factorize 3.931772e+01 s (554.95 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 7.310309e-02 s - iteration 1 : total iteration time 0.0809 s error 6.269e-11 Time for refinement 1.760062e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.603467e-08 max(|| b_i - A x_i ||_1) 3.289392e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.300281e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.603467e-08 max(|| b_i - A x_i ||_1) 3.289392e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.300281e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.603467e-08 max(|| b_i - A x_i ||_1) 3.289392e-08 max(|| b_i - A x_i ||_1 / (||A||_1 
* ||x_i||_oo * eps)) 8.300281e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.603467e-08 max(|| b_i - A x_i ||_1) 3.289392e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.300281e-01 (SUCCESS) Start 2266: mpi_dst_example_simple_lap_c_facto1_sched0_not_rqrrtbegin 1935/3626 Test #2267: mpi_dst_example_simple_lap_c_facto1_sched0_not_rqrrtend .................***Timeout 381.80 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.601046e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of 
nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.003987e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.181843e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 4.663297e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.125963e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 7.482661e-01 s Time to initialize coeftab 1.477193e-01 s Time to factorize 6.458949e+00 s ( 3.30 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 4.893448e-01 s Time for refinement 5.940034e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.492869e-07 max(|| b_i - A x_i ||_1) 8.544968e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.156193e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.492869e-07 max(|| b_i - A x_i ||_1) 8.544968e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.156193e+00 (SUCCESS) max(|| b_i - A 
x_i ||_2 / || b_i ||_2) 2.492869e-07 max(|| b_i - A x_i ||_1) 8.544968e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.156193e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.492869e-07 max(|| b_i - A x_i ||_1) 8.544968e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.156193e+00 (SUCCESS) Start 2267: mpi_dst_example_simple_lap_c_facto1_sched0_not_rqrrtend 1935/3626 Test #2268: mpi_dst_example_simple_lap_c_facto1_sched0_kway_rqrrtbegin ..............***Timeout 381.77 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.088767e+01 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.674634e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.345531e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.510954e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.267413e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.869909e-01 s Time to initialize coeftab 5.889213e-01 s Time to factorize 1.316070e+01 s ( 1.62 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 9.154615e-02 s - iteration 1 : total iteration time 0.0654 s error 6.269e-11 Time for refinement 1.823342e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.603467e-08 max(|| b_i - A x_i ||_1) 3.289392e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.300281e-01 
(SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.603467e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.603467e-08 max(|| b_i - A x_i ||_1) 3.289392e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.300281e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.603467e-08 max(|| b_i - A x_i ||_1) 3.289392e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.300281e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 3.289392e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.300281e-01 (SUCCESS) Start 2268: mpi_dst_example_simple_lap_c_facto1_sched0_kway_rqrrtbegin 1935/3626 Test #2269: mpi_dst_example_simple_lap_c_facto1_sched0_kway_rqrrtend ................***Timeout 381.75 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 
1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.051315e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.681375e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.208484e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.515030e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.318510e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 5.768827e-01 s Time to initialize coeftab 4.451829e-01 s Time to factorize 7.979415e+00 s ( 2.67 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 6.306479e-01 s Time for refinement 6.662760e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.496095e-07 
max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.496095e-07 max(|| b_i - A x_i ||_1) 8.554426e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.158579e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.554426e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.158579e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.496095e-07 max(|| b_i - A x_i ||_1) 8.554426e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.158579e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.496095e-07 max(|| b_i - A x_i ||_1) 8.554426e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.158579e+00 (SUCCESS) Start 2269: mpi_dst_example_simple_lap_c_facto1_sched0_kway_rqrrtend 1935/3626 Test #2270: mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_rqrrtbegin ...***Timeout 381.74 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been 
automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.788608e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.781368e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.165784e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.272311e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.939014e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.428841e-01 s Time to initialize coeftab 8.327544e-01 s Time to factorize 2.735328e+01 s (797.69 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 9.731234e-03 s - iteration 1 : total iteration time 0.0111 s error 6.269e-11 Time for refinement 2.056850e-02 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i 
||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.603467e-08 max(|| b_i - A x_i ||_1) 3.289392e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.300281e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.603467e-08 max(|| b_i - A x_i ||_1) 3.289392e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.300281e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.603467e-08 max(|| b_i - A x_i ||_1) 3.289392e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.300281e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.603467e-08 max(|| b_i - A x_i ||_1) 3.289392e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.300281e-01 (SUCCESS) Start 2270: mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_rqrrtbegin 1935/3626 Test #2271: mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_rqrrtend .....***Timeout 381.71 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method 
CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.206991e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.335640e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.188105e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 8.197687e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.408922e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.929036e-01 s Time to initialize coeftab 3.936010e-01 s Time to factorize 4.971467e+00 s ( 4.29 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 2.510593e-01 s Time for refinement 2.393946e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i 
||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.492869e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.492869e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.492869e-07 max(|| b_i - A x_i ||_1) 8.544968e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.156193e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.492869e-07 max(|| b_i - A x_i ||_1) 8.544968e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.156193e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.544968e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.156193e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.544968e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.156193e+00 (SUCCESS) Start 2271: mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_rqrrtend 1935/3626 Test #2272: mpi_dst_example_simple_lap_c_facto1_sched0_kway_pqrcpilu0 ...............***Timeout 381.68 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress 
minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.573462e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.669929e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.405265e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.738779e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.860786e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.946790e-01 s Time to initialize coeftab 4.044322e-01 s Time to factorize 2.136331e+01 s (1021.35 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to 
solve 1.200172e-01 s - iteration 1 : total iteration time 0.0857 s error 3.0457e-11 Time for refinement 1.676657e-01 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.364894e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.364894e-08 max(|| b_i - A x_i ||_1) 3.214378e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.110993e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 3.214378e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.110993e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.364894e-08 max(|| b_i - A x_i ||_1) 3.214378e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.110993e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.364894e-08 max(|| b_i - A x_i ||_1) 3.214378e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.110993e-01 (SUCCESS) Start 2272: mpi_dst_example_simple_lap_c_facto1_sched0_kway_pqrcpilu0 1935/3626 Test #2273: mpi_dst_example_simple_lap_c_facto1_sched0_kway_pqrcpilu1 ...............***Timeout 381.65 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal 
Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.806594e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.749267e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.797649e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.178974e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.108491e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.221812e+00 s Time to initialize coeftab 1.250291e+00 s Time to factorize 5.225995e+00 s ( 4.08 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in 
diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 3.137154e-01 s - iteration 1 : total iteration time 0.632 s error 3.0429e-11 Time for refinement 1.137031e+00 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.364818e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.364818e-08 max(|| b_i - A x_i ||_1) 3.213880e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.109738e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.364818e-08 max(|| b_i - A x_i ||_1) 3.213880e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.109738e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 3.213880e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.109738e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.364818e-08 max(|| b_i - A x_i ||_1) 3.213880e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.109738e-01 (SUCCESS) Start 2273: mpi_dst_example_simple_lap_c_facto1_sched0_kway_pqrcpilu1 1935/3626 Test #2274: mpi_dst_example_simple_lap_c_facto2_sched0_not_svdbegin .................***Timeout 381.62 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number 
of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.516869e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.993081e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.284442e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.339894e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.071267e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 5.339460e-01 s Time to initialize coeftab 1.392796e+00 s Time to factorize 3.931011e+01 s ( 1.02 MFlop/s) Number of operations 32.61 
MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko Start 2274: mpi_dst_example_simple_lap_c_facto2_sched0_not_svdbegin 1935/3626 Test #2275: mpi_dst_example_simple_lap_c_facto2_sched0_not_svdend ...................***Timeout 381.59 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.096775e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol 
factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.273631e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.010585e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 4.043392e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.653143e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.110484e+00 s Time to initialize coeftab 4.053070e+00 s Time to factorize 2.279303e+01 s ( 1.75 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 1.714557e-01 s Time for refinement 1.906696e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.758151e-07 max(|| b_i - A x_i ||_1) 7.579736e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.912631e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.758151e-07 max(|| b_i - A x_i ||_1) 7.579736e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 
1.912631e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.758151e-07 max(|| b_i - A x_i ||_1) 7.579736e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.912631e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.758151e-07 max(|| b_i - A x_i ||_1) 7.579736e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.912631e+00 (SUCCESS) Start 2275: mpi_dst_example_simple_lap_c_facto2_sched0_not_svdend 1935/3626 Test #2276: mpi_dst_example_simple_lap_c_facto2_sched0_kway_svdbegin ................***Timeout 381.57 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 
2.617667e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.726236e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.401894e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.102445e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.060934e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.865277e-01 s Time to initialize coeftab 1.146268e+00 s Start 2276: mpi_dst_example_simple_lap_c_facto2_sched0_kway_svdbegin 1935/3626 Test #2277: mpi_dst_example_simple_lap_c_facto2_sched0_kway_svdend ..................***Timeout 381.50 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute 
Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.354531e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.225629e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.175031e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 4.773185e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.904734e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.411456e+00 s Time to initialize coeftab 3.878403e-01 s Time to factorize 1.924919e+01 s ( 2.08 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko 
Time to solve 2.892208e-01 s Time for refinement 2.882339e-01 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.758176e-07 max(|| b_i - A x_i ||_1) 7.581235e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.913009e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.758176e-07 max(|| b_i - A x_i ||_1) 7.581235e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.913009e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.758176e-07 max(|| b_i - A x_i ||_1) 7.581235e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.913009e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.758176e-07 max(|| b_i - A x_i ||_1) 7.581235e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.913009e+00 (SUCCESS) Start 2277: mpi_dst_example_simple_lap_c_facto2_sched0_kway_svdend 1935/3626 Test #2278: mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_svdbegin .....***Timeout 381.49 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 
Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.089310e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.784269e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.266063e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 7.581285e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.187896e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 5.863050e+00 s Time to initialize coeftab 5.292653e+00 s Time to factorize 4.130242e+01 s (990.97 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside 
selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 2.593118e-01 s Time for refinement 3.865170e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.918342e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.918342e-07 max(|| b_i - A x_i ||_1) 8.370964e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.112285e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.918342e-07 max(|| b_i - A x_i ||_1) 8.370964e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.112285e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.918342e-07 max(|| b_i - A x_i ||_1) 8.370964e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.112285e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.370964e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.112285e+00 (SUCCESS) Start 2278: mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_svdbegin 1935/3626 Test #2279: mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_svdend .......***Timeout 381.48 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 
GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.673677e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.978616e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.332401e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 3.586247e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.246298e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.049341e-01 s Time to initialize coeftab 4.012647e-01 s Time to factorize 1.174175e+01 s ( 3.40 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ 
Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 2.037014e-01 s Time for refinement 1.651371e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.757672e-07 max(|| b_i - A x_i ||_1) 7.581372e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.913044e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.757672e-07 max(|| b_i - A x_i ||_1) 7.581372e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.913044e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.757672e-07 max(|| b_i - A x_i ||_1) 7.581372e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.913044e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.757672e-07 max(|| b_i - A x_i ||_1) 7.581372e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.913044e+00 (SUCCESS) Start 2279: mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_svdend 1935/3626 Test #2280: mpi_dst_example_simple_lap_c_facto2_sched0_not_pqrcpbegin ...............***Timeout 381.46 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread 
dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.135925e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.293486e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.743784e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 3.272289e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.583967e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.209886e-01 s Time to initialize coeftab 1.409107e+00 s Time to factorize 4.215370e+01 s (970.95 KFlop/s) Number 
of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 3.316240e-01 s - iteration 1 : total iteration time 0.406 s error 7.3909e-11 Time for refinement 8.921319e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.399944e-08 max(|| b_i - A x_i ||_1) 3.212830e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.107088e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.399944e-08 max(|| b_i - A x_i ||_1) 3.212830e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.107088e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.399944e-08 max(|| b_i - A x_i ||_1) 3.212830e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.107088e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.399944e-08 max(|| b_i - A x_i ||_1) 3.212830e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.107088e-01 (SUCCESS) Start 2280: mpi_dst_example_simple_lap_c_facto2_sched0_not_pqrcpbegin 1935/3626 Test #2281: mpi_dst_example_simple_lap_c_facto2_sched0_not_pqrcpend .................***Timeout 381.41 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : 
Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.095832e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.253242e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.185206e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.712606e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.469053e+01 s +-------------------------------------------------+ Factorization task: 
Factorization used: LU Time to initialize internal csc 9.848751e-01 s Time to initialize coeftab 7.261042e-01 s Time to factorize 9.286771e+00 s ( 4.30 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 5.665813e-01 s - iteration 1 : total iteration time 0.987 s error 1.0383e-12 Time for refinement 1.963047e+00 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.211938e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.211938e-08 max(|| b_i - A x_i ||_1) 3.083253e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.780120e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 3.083253e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.780120e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.211938e-08 max(|| b_i - A x_i ||_1) 3.083253e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.780120e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.211938e-08 max(|| b_i - A x_i ||_1) 3.083253e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.780120e-01 (SUCCESS) Start 2281: mpi_dst_example_simple_lap_c_facto2_sched0_not_pqrcpend 1935/3626 Test #2282: mpi_dst_example_simple_lap_c_facto2_sched0_kway_pqrcpbegin ..............***Timeout 381.37 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread 
number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.196869e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.099660e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.318077e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 4.696641e+00 s 
+-------------------------------------------------+ Analyze task: Total time for analyze 3.883502e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.040388e+01 s Time to initialize coeftab 4.205117e-01 s Time to factorize 1.534344e+01 s ( 2.61 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 1.478765e-01 s - iteration 1 : total iteration time 0.188 s error 7.3909e-11 Time for refinement 3.403583e-01 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.399944e-08 max(|| b_i - A x_i ||_1) 3.212830e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.107088e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.399944e-08 max(|| b_i - A x_i ||_1) 3.212830e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.107088e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.399944e-08 max(|| b_i - A x_i ||_1) 3.212830e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.107088e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.399944e-08 max(|| b_i - A x_i ||_1) 3.212830e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.107088e-01 (SUCCESS) Start 2282: mpi_dst_example_simple_lap_c_facto2_sched0_kway_pqrcpbegin 1935/3626 Test #2283: mpi_dst_example_simple_lap_c_facto2_sched0_kway_pqrcpend 
................***Timeout 381.34 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.099046e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.512659e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.505283e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 
35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.194999e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.583924e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.359820e+00 s Time to initialize coeftab 1.734030e-01 s Time to factorize 4.230281e+01 s (967.53 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 1.615994e+00 s - iteration 1 : total iteration time 0.999 s error 1.0457e-12 Time for refinement 1.760333e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.212735e-08 max(|| b_i - A x_i ||_1) 3.086050e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.787178e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.212735e-08 max(|| b_i - A x_i ||_1) 3.086050e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.787178e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.212735e-08 max(|| b_i - A x_i ||_1) 3.086050e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.787178e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.212735e-08 max(|| b_i - A x_i ||_1) 3.086050e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.787178e-01 
(SUCCESS) Start 2283: mpi_dst_example_simple_lap_c_facto2_sched0_kway_pqrcpend 1935/3626 Test #2284: mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_pqrcpbegin ...***Timeout 381.31 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.660100e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.369224e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping 
criterion -1 Time for reordering 1.222761e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.240374e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.027147e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 5.368754e-01 s Time to initialize coeftab 2.463700e-01 s Time to factorize 1.885250e+01 s ( 2.12 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 8.755300e-01 s - iteration 1 : total iteration time 1.49 s error 7.3926e-11 Time for refinement 2.627738e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.397613e-08 max(|| b_i - A x_i ||_1) 3.213810e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.109562e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.397613e-08 max(|| b_i - A x_i ||_1) 3.213810e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.109562e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.397613e-08 max(|| b_i - A x_i ||_1) 3.213810e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 
8.109562e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.397613e-08 max(|| b_i - A x_i ||_1) 3.213810e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.109562e-01 (SUCCESS) Start 2284: mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_pqrcpbegin 1935/3626 Test #2285: mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_pqrcpend .....***Timeout 381.29 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.062283e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of 
nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.281719e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.840754e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 9.997229e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.281953e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.377989e-01 s Time to initialize coeftab 2.499867e+00 s Time to factorize 5.839374e+00 s ( 6.84 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 3.060292e-01 s - iteration 1 : total iteration time 0.403 s error 1.0457e-12 Time for refinement 7.569325e-01 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.212735e-08 max(|| b_i - A x_i ||_1) 3.086050e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.787178e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.212735e-08 max(|| b_i - A x_i ||_1) 3.086050e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * 
||x_i||_oo * eps)) 7.787178e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.212735e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.212735e-08 max(|| b_i - A x_i ||_1) 3.086050e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.787178e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 3.086050e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.787178e-01 (SUCCESS) Start 2285: mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_pqrcpend 1935/3626 Test #2286: mpi_dst_example_simple_lap_c_facto2_sched0_not_rqrcpbegin ...............***Timeout 381.25 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: 
Scotch Time to compute ordering 2.837744e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.303890e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.545396e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 4.406674e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.373706e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 4.761027e-01 s Time to initialize coeftab 9.636111e-01 s Time to factorize 1.869487e+01 s ( 2.14 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 2.380926e-01 s - iteration 1 : total iteration time 0.603 s error 7.3978e-11 Time for refinement 1.307190e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.568075e-08 max(|| b_i - A x_i ||_1) 3.287986e-08 max(|| b_i - A x_i ||_1 / 
(||A||_1 * ||x_i||_oo * eps)) 8.296733e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.568075e-08 max(|| b_i - A x_i ||_1) 3.287986e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.296733e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.568075e-08 max(|| b_i - A x_i ||_1) 3.287986e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.296733e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.568075e-08 max(|| b_i - A x_i ||_1) 3.287986e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.296733e-01 (SUCCESS) Start 2286: mpi_dst_example_simple_lap_c_facto2_sched0_not_rqrcpbegin 1935/3626 Test #2287: mpi_dst_example_simple_lap_c_facto2_sched0_not_rqrcpend .................***Timeout 381.21 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 
1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.153144e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.258506e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.105434e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 5.354094e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.737832e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 4.168172e-01 s Time to initialize coeftab 3.796489e+00 s Time to factorize 6.395161e+00 s ( 6.25 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 1.125796e-01 s - iteration 1 : total iteration time 0.026 s error 9.9002e-13 Time for refinement 6.087326e-02 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i 
||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.234826e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.234826e-08 max(|| b_i - A x_i ||_1) 3.089257e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.795270e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.234826e-08 max(|| b_i - A x_i ||_1) 3.089257e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.795270e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.234826e-08 max(|| b_i - A x_i ||_1) 3.089257e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.795270e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 3.089257e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.795270e-01 (SUCCESS) Start 2287: mpi_dst_example_simple_lap_c_facto2_sched0_not_rqrcpend 1935/3626 Test #2288: mpi_dst_example_simple_lap_c_facto2_sched0_kway_rqrcpbegin ..............***Timeout 381.18 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread 
number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.596248e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.748411e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.656422e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.167610e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.982068e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 5.725179e-01 s Time to initialize coeftab 1.443230e+00 s Time to factorize 1.580035e+01 s ( 2.53 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 4.258183e-02 s - iteration 1 : total iteration time 0.0511 s error 7.4069e-11 Time for refinement 1.025287e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 
2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.566046e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.566046e-08 max(|| b_i - A x_i ||_1) 3.285507e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.290476e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.566046e-08 max(|| b_i - A x_i ||_1) 3.285507e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.290476e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.566046e-08 max(|| b_i - A x_i ||_1) 3.285507e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.290476e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 3.285507e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.290476e-01 (SUCCESS) Start 2288: mpi_dst_example_simple_lap_c_facto2_sched0_kway_rqrcpbegin 1935/3626 Test #2289: mpi_dst_example_simple_lap_c_facto2_sched0_kway_rqrcpend ................***Timeout 381.15 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 
2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.075402e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.270902e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.283735e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 3.754151e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.527986e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 9.025289e-01 s Time to initialize coeftab 1.923237e-01 s Time to factorize 1.251243e+01 s ( 3.19 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 1.894347e-02 s - iteration 1 : total iteration time 0.0106 s error 1.0364e-12 
Time for refinement 2.601214e-02 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.236766e-08 max(|| b_i - A x_i ||_1) 3.094496e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.808489e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.236766e-08 max(|| b_i - A x_i ||_1) 3.094496e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.808489e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.236766e-08 max(|| b_i - A x_i ||_1) 3.094496e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.808489e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.236766e-08 max(|| b_i - A x_i ||_1) 3.094496e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.808489e-01 (SUCCESS) Start 2289: mpi_dst_example_simple_lap_c_facto2_sched0_kway_rqrcpend 1935/3626 Test #2290: mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_rqrcpbegin ...***Timeout 381.14 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank 
parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.382703e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.367992e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.704228e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 3.677632e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.005239e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 7.501933e-01 s Time to initialize coeftab 1.781135e+00 s Time to factorize 4.288027e+01 s (954.50 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 
Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 1.297158e+00 s - iteration 1 : total iteration time 1.95 s error 7.3975e-11 Time for refinement 3.562134e+00 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.568087e-08 max(|| b_i - A x_i ||_1) 3.287849e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.296388e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.568087e-08 max(|| b_i - A x_i ||_1) 3.287849e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.296388e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.568087e-08 max(|| b_i - A x_i ||_1) 3.287849e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.296388e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.568087e-08 max(|| b_i - A x_i ||_1) 3.287849e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.296388e-01 (SUCCESS) Start 2290: mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_rqrcpbegin 1935/3626 Test #2291: mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_rqrcpend .....***Timeout 380.70 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 
6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.793822e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.655043e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.529577e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 8.721699e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.745090e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.043543e+00 s Time to initialize coeftab 1.545672e+00 s Time to factorize 1.868560e+01 s ( 2.14 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: 
------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 4.824806e-01 s - iteration 1 : total iteration time 0.545 s error 1.0364e-12 Time for refinement 1.227544e+00 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.236766e-08 max(|| b_i - A x_i ||_1) 3.094496e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.808489e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.236766e-08 max(|| b_i - A x_i ||_1) 3.094496e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.808489e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.236766e-08 max(|| b_i - A x_i ||_1) 3.094496e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.808489e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.236766e-08 max(|| b_i - A x_i ||_1) 3.094496e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.808489e-01 (SUCCESS) Start 2291: mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_rqrcpend 1935/3626 Test #2292: mpi_dst_example_simple_lap_c_facto2_sched0_not_tqrcpbegin ...............***Timeout 375.80 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled 
PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.299430e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.627310e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.576552e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 4.052549e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.780374e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 
1.148635e+00 s Time to initialize coeftab 9.417016e-01 s Time to factorize 1.293092e+01 s ( 3.09 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Start 2292: mpi_dst_example_simple_lap_c_facto2_sched0_not_tqrcpbegin 1935/3626 Test #2293: mpi_dst_example_simple_lap_c_facto2_sched0_not_tqrcpend .................***Timeout 375.75 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.712952e+01 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.533462e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.258746e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.254491e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.159256e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.824794e-01 s Time to initialize coeftab 8.521020e-01 s Time to factorize 1.193897e+01 s ( 3.35 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 4.750062e-01 s - iteration 1 : total iteration time 1.29 s error 1.1809e-12 Time for refinement 2.154872e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.242607e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.242607e-08 max(|| b_i - A x_i ||_1) 3.087002e-08 max(|| b_i - A x_i ||_1 / 
(||A||_1 * ||x_i||_oo * eps)) 7.789579e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.242607e-08 max(|| b_i - A x_i ||_1) 3.087002e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.789579e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 3.087002e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.789579e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.242607e-08 max(|| b_i - A x_i ||_1) 3.087002e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.789579e-01 (SUCCESS) Start 2293: mpi_dst_example_simple_lap_c_facto2_sched0_not_tqrcpend 1935/3626 Test #2294: mpi_dst_example_simple_lap_c_facto2_sched0_kway_tqrcpbegin ..............***Timeout 374.01 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 
200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.274239e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.748852e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.822608e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.118402e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.592481e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.416378e-01 s Time to initialize coeftab 1.346014e+00 s Time to factorize 2.422231e+01 s ( 1.65 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 1.592273e-01 s - iteration 1 : total iteration time 0.153 s error 7.3926e-11 Time for refinement 4.024059e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i 
- A x_i ||_2 / || b_i ||_2) 6.570428e-08 max(|| b_i - A x_i ||_1) 3.278091e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.271766e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.570428e-08 max(|| b_i - A x_i ||_1) 3.278091e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.271766e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.570428e-08 max(|| b_i - A x_i ||_1) 3.278091e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.271766e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.570428e-08 max(|| b_i - A x_i ||_1) 3.278091e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.271766e-01 (SUCCESS) Start 2294: mpi_dst_example_simple_lap_c_facto2_sched0_kway_tqrcpbegin 1935/3626 Test #2295: mpi_dst_example_simple_lap_c_facto2_sched0_kway_tqrcpend ................***Timeout 373.58 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 
Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.553398e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.005209e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.317762e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.974941e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.802381e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 4.465340e-01 s Time to initialize coeftab 3.489936e-01 s Time to factorize 1.219875e+01 s ( 3.28 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 6.305910e-01 s - iteration 1 : total iteration time 0.653 s error 1.069e-12 Time for refinement 1.079930e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 
max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.244326e-08 max(|| b_i - A x_i ||_1) 3.093329e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.805545e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.244326e-08 max(|| b_i - A x_i ||_1) 3.093329e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.805545e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.244326e-08 max(|| b_i - A x_i ||_1) 3.093329e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.805545e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.244326e-08 max(|| b_i - A x_i ||_1) 3.093329e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.805545e-01 (SUCCESS) Start 2295: mpi_dst_example_simple_lap_c_facto2_sched0_kway_tqrcpend 1935/3626 Test #2296: mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_tqrcpbegin ...***Timeout 373.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 
Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.358591e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.244616e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.182136e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.506017e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.872441e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.874472e-01 s Time to initialize coeftab 2.867406e+00 s Time to factorize 3.743098e+01 s ( 1.07 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 1.116816e-02 s - iteration 1 : total iteration time 0.0102 s error 7.3644e-11 Time for refinement 1.894110e-02 s || A ||_1 
5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.542296e-08 max(|| b_i - A x_i ||_1) 3.272508e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.257678e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.542296e-08 max(|| b_i - A x_i ||_1) 3.272508e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.257678e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.542296e-08 max(|| b_i - A x_i ||_1) 3.272508e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.257678e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.542296e-08 max(|| b_i - A x_i ||_1) 3.272508e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.257678e-01 (SUCCESS) Start 2296: mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_tqrcpbegin 1935/3626 Test #2297: mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_tqrcpend .....***Timeout 373.02 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy 
Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.924941e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.693254e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.961622e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 4.873804e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.621385e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.538169e+00 s Time to initialize coeftab 9.445824e-01 s Time to factorize 2.570080e+01 s ( 1.56 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko 
------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 4.317894e-01 s - iteration 1 : total iteration time 1.46 s error 1.32e-12 Time for refinement 2.662620e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.243290e-08 max(|| b_i - A x_i ||_1) 3.088328e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.792926e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.243290e-08 max(|| b_i - A x_i ||_1) 3.088328e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.792926e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.243290e-08 max(|| b_i - A x_i ||_1) 3.088328e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.792926e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.243290e-08 max(|| b_i - A x_i ||_1) 3.088328e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.792926e-01 (SUCCESS) Start 2297: mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_tqrcpend 1935/3626 Test #2298: mpi_dst_example_simple_lap_c_facto2_sched0_not_rqrrtbegin ...............***Timeout 372.34 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 
160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.945667e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.577838e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.874476e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.906220e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.205329e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.362535e+00 s Time to initialize coeftab 2.805978e+00 s Time to factorize 1.127859e+01 s ( 3.54 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: 
------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 1.051270e-02 s - iteration 1 : total iteration time 0.0082 s error 7.3518e-11 Time for refinement 2.427953e-02 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.453004e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.453004e-08 max(|| b_i - A x_i ||_1) 3.271340e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.254730e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 3.271340e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.254730e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.453004e-08 max(|| b_i - A x_i ||_1) 3.271340e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.254730e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.453004e-08 max(|| b_i - A x_i ||_1) 3.271340e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.254730e-01 (SUCCESS) Start 2298: mpi_dst_example_simple_lap_c_facto2_sched0_not_rqrrtbegin 1935/3626 Test #2299: mpi_dst_example_simple_lap_c_facto2_sched0_not_rqrrtend .................***Timeout 372.04 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.257572e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.264752e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.786078e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 3.410687e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.881010e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize 
internal csc 8.711304e+00 s Time to initialize coeftab 1.796340e+00 s Time to factorize 1.299763e+01 s ( 3.08 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 4.295600e-01 s - iteration 1 : total iteration time 0.836 s error 1.2839e-12 Time for refinement 1.627082e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.281972e-08 max(|| b_i - A x_i ||_1) 3.109290e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.845820e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.281972e-08 max(|| b_i - A x_i ||_1) 3.109290e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.845820e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.281972e-08 max(|| b_i - A x_i ||_1) 3.109290e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.845820e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.281972e-08 max(|| b_i - A x_i ||_1) 3.109290e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.845820e-01 (SUCCESS) Start 2299: mpi_dst_example_simple_lap_c_facto2_sched0_not_rqrrtend Test #1720: shm_example_simple_lap_z_facto3_sched4_kway_svdend ......................***Timeout 369.41 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.613600e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.192736e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.389985e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 1.029269e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.759074e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 6.067279e-01 s Time to initialize coeftab 
2.353532e-01 s Time to factorize 5.555825e+00 s ( 3.65 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 497 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1.25 Mo / 1.25 Mo Time to solve 4.741120e+00 s Time for refinement 3.424044e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.035563e-16 max(|| b_i - A x_i ||_1) 2.037040e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.140144e-03 (SUCCESS) 1936/3626 Test #2300: mpi_dst_example_simple_lap_c_facto2_sched0_kway_rqrrtbegin ..............***Timeout 369.33 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.517833e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.095931e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.618142e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 3.058538e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.888447e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.524657e+00 s Time to initialize coeftab 3.216386e+00 s Time to factorize 9.404240e+00 s ( 4.25 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 6.214869e-02 s - iteration 1 : total iteration time 0.0739 s error 7.3564e-11 Time for refinement 1.987653e-01 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 
6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.451760e-08 max(|| b_i - A x_i ||_1) 3.271561e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.255287e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.451760e-08 max(|| b_i - A x_i ||_1) 3.271561e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.255287e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.451760e-08 max(|| b_i - A x_i ||_1) 3.271561e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.255287e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.451760e-08 max(|| b_i - A x_i ||_1) 3.271561e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.255287e-01 (SUCCESS) Start 2300: mpi_dst_example_simple_lap_c_facto2_sched0_kway_rqrrtbegin 1936/3626 Test #2301: mpi_dst_example_simple_lap_c_facto2_sched0_kway_rqrrtend ................***Timeout 367.81 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 
Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.996574e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.256904e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.040882e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 4.631323e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.498932e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.493723e-01 s Time to initialize coeftab 2.237380e-01 s Time to factorize 4.938717e+00 s ( 8.09 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 7.949764e-02 s - iteration 1 : total iteration time 0.00728 s error 1.2914e-12 Time for refinement 1.608875e-02 s 
|| A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.281077e-08 max(|| b_i - A x_i ||_1) 3.107906e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.842327e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.281077e-08 max(|| b_i - A x_i ||_1) 3.107906e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.842327e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.281077e-08 max(|| b_i - A x_i ||_1) 3.107906e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.842327e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.281077e-08 max(|| b_i - A x_i ||_1) 3.107906e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.842327e-01 (SUCCESS) Start 2301: mpi_dst_example_simple_lap_c_facto2_sched0_kway_rqrrtend 1936/3626 Test #2302: mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_rqrrtbegin ...***Timeout 365.41 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory 
Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.026292e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.252826e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.967049e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.573004e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.555933e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 5.811387e-01 s Time to initialize coeftab 1.369019e+00 s Time to factorize 2.148250e+01 s ( 1.86 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko 
------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 6.195487e-01 s - iteration 1 : total iteration time 1.1 s error 7.3564e-11 Time for refinement 2.158758e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.451760e-08 max(|| b_i - A x_i ||_1) 3.271561e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.255287e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.451760e-08 max(|| b_i - A x_i ||_1) 3.271561e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.255287e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.451760e-08 max(|| b_i - A x_i ||_1) 3.271561e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.255287e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.451760e-08 max(|| b_i - A x_i ||_1) 3.271561e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.255287e-01 (SUCCESS) Start 2302: mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_rqrrtbegin 1936/3626 Test #2303: mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_rqrrtend .....***Timeout 365.02 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel 
MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.423189e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.384582e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.203215e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 3.283135e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.774218e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.909093e+00 s Time to initialize coeftab 1.253692e-01 s Time to factorize 1.127544e+01 s ( 3.54 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: 
------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 1.080814e-01 s - iteration 1 : total iteration time 0.129 s error 1.2914e-12 Time for refinement 2.859358e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.281077e-08 max(|| b_i - A x_i ||_1) 3.107906e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.842327e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.281077e-08 max(|| b_i - A x_i ||_1) 3.107906e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.842327e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.281077e-08 max(|| b_i - A x_i ||_1) 3.107906e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.842327e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.281077e-08 max(|| b_i - A x_i ||_1) 3.107906e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.842327e-01 (SUCCESS) Start 2303: mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_rqrrtend 1936/3626 Test #2304: mpi_dst_example_simple_lap_c_facto2_sched0_kway_pqrcpilu0 ...............***Timeout 360.97 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: 
sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.560014e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.360688e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.335338e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.779515e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.946788e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 
1.065897e+00 s Time to initialize coeftab 5.490561e-01 s Time to factorize 2.513914e+01 s ( 1.59 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 2.409055e-02 s - iteration 1 : total iteration time 0.021 s error 5.5189e-11 Time for refinement 5.648351e-02 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.421761e-08 max(|| b_i - A x_i ||_1) 3.227130e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.143172e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.421761e-08 max(|| b_i - A x_i ||_1) 3.227130e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.143172e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.421761e-08 max(|| b_i - A x_i ||_1) 3.227130e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.143172e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.421761e-08 max(|| b_i - A x_i ||_1) 3.227130e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.143172e-01 (SUCCESS) Start 2304: mpi_dst_example_simple_lap_c_facto2_sched0_kway_pqrcpilu0 1936/3626 Test #2305: mpi_dst_example_simple_lap_c_facto2_sched0_kway_pqrcpilu1 ...............***Timeout 359.21 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.728018e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.582780e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.478407e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.956069e+00 s 
+-------------------------------------------------+ Analyze task: Total time for analyze 3.081548e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 8.692389e-01 s Time to initialize coeftab 5.345987e-01 s Time to factorize 1.271312e+01 s ( 3.14 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 4.625929e-01 s - iteration 1 : total iteration time 1.39 s error 5.5204e-11 Time for refinement 2.313316e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.421416e-08 max(|| b_i - A x_i ||_1) 3.226208e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.140844e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.421416e-08 max(|| b_i - A x_i ||_1) 3.226208e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.140844e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.421416e-08 max(|| b_i - A x_i ||_1) 3.226208e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.140844e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.421416e-08 max(|| b_i - A x_i ||_1) 3.226208e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.140844e-01 (SUCCESS) Start 2305: mpi_dst_example_simple_lap_c_facto2_sched0_kway_pqrcpilu1 1936/3626 Test #2307: mpi_dst_example_simple_lap_c_facto3_sched0_not_svdend 
...................***Timeout 354.56 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.927811e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.164751e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.910705e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 
Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 4.578410e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.635381e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 1.641034e+00 s Time to initialize coeftab 6.262095e-01 s Time to factorize 3.219799e+01 s (645.00 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.135088e+00 s Time for refinement 3.206466e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.896185e-07 max(|| b_i - A x_i ||_1) 8.679255e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.190078e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.896185e-07 max(|| b_i - A x_i ||_1) 8.679255e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.190078e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.896185e-07 max(|| b_i - A x_i ||_1) 8.679255e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.190078e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.896185e-07 max(|| b_i - A x_i ||_1) 8.679255e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.190078e+00 (SUCCESS) Start 2307: 
mpi_dst_example_simple_lap_c_facto3_sched0_not_svdend 1936/3626 Test #2309: mpi_dst_example_simple_lap_c_facto3_sched0_kway_svdend ..................***Timeout 352.13 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.735666e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.110157e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.998087e-02 
s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.020623e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.120382e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 5.933899e-01 s Time to initialize coeftab 3.721965e-01 s Time to factorize 1.447143e+01 s ( 1.40 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 2.500960e-01 s Time for refinement 3.010524e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.896185e-07 max(|| b_i - A x_i ||_1) 8.679255e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.190078e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.896185e-07 max(|| b_i - A x_i ||_1) 8.679255e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.190078e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.896185e-07 max(|| b_i - A x_i ||_1) 8.679255e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.190078e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.896185e-07 max(|| b_i - A x_i ||_1) 
8.679255e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.190078e+00 (SUCCESS) Start 2309: mpi_dst_example_simple_lap_c_facto3_sched0_kway_svdend 1936/3626 Test #2311: mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_svdend .......***Timeout 350.96 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.385101e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.330537e-02 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.158111e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.120103e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.770567e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 6.152362e-01 s Time to initialize coeftab 5.699488e-01 s Time to factorize 2.930804e+01 s (708.60 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 2.496485e-02 s Time for refinement 2.602594e-02 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.896185e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.896185e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.896185e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.896185e-07 max(|| b_i - A x_i ||_1) 8.679255e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.190078e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.679255e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.190078e+00 (SUCCESS) max(|| b_i - 
A x_i ||_1) 8.679255e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.190078e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.679255e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.190078e+00 (SUCCESS) Start 2311: mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_svdend 1936/3626 Test #2312: mpi_dst_example_simple_lap_c_facto3_sched0_not_pqrcpbegin ...............***Timeout 349.56 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.563202e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax 
Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.038077e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.762692e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.748959e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.943749e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 9.019884e-01 s Time to initialize coeftab 4.846072e-01 s Time to factorize 2.539963e+01 s (817.64 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Start 2312: mpi_dst_example_simple_lap_c_facto3_sched0_not_pqrcpbegin Test #1745: shm_example_simple_lap_z_facto3_sched4_kwayprojections_rqrrtbegin .......***Timeout 346.70 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD 
Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.260433e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.222097e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.194812e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 6.178973e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.358123e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 2.166804e-01 s Time to initialize coeftab 1.963771e+00 s Time to factorize 1.135838e+01 s ( 1.79 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 497 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko 
------------------------------------------------ Total 1.25 Mo / 1.25 Mo Time to solve 1.800526e+00 s - iteration 1 : total iteration time 3.7 s error 1.058e-13 Time for refinement 7.580892e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.058020e-13 max(|| b_i - A x_i ||_1) 1.958121e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.941003e-01 (SUCCESS) Test #1747: shm_example_simple_lap_z_facto3_sched4_kway_pqrcpilu0 ...................***Timeout 346.69 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.925039e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.684119e-02 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.300041e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.462050e-04 s Time for mapping/scheduling 8.970134e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.033848e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 1.500604e-01 s Time to initialize coeftab 2.849880e-01 s Time to factorize 3.350236e+00 s ( 6.05 MFlop/s) Number of operations 25.68 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 497 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1.25 Mo / 1.25 Mo Time to solve 1.472492e+00 s - iteration 1 : total iteration time 8.01 s error 1.5454e-15 Time for refinement 1.309573e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.546433e-15 max(|| b_i - A x_i ||_1) 1.987572e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.015318e-03 (SUCCESS) Test #1752: shm_example_simple_lap_z_facto4_sched4_kway_svdend ......................***Timeout 346.68 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 
Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.378330e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.991214e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.007013e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.228226e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.558865e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 8.548775e-02 s Time to initialize coeftab 1.449188e+00 s Time to factorize 4.671897e+00 s ( 4.56 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 
7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 497 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1.25 Mo / 1.25 Mo Time to solve 5.498485e+00 s Time for refinement 3.048432e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.860016e-16 max(|| b_i - A x_i ||_1) 1.875390e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.732245e-03 (SUCCESS) Test #1756: shm_example_simple_lap_z_facto4_sched4_not_pqrcpend .....................***Timeout 346.67 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.856620e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L 
structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.327282e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.417198e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 9.520349e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.972882e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 2.102524e-01 s Time to initialize coeftab 1.010848e+00 s Time to factorize 3.721195e+00 s ( 5.73 MFlop/s) Number of operations 26.71 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 7.03 Ko Outside 8.44 Ko Low-rank supernodes Diag in diag 497 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 764 Ko / 764 Ko ------------------------------------------------ Total 1.25 Mo / 1.25 Mo Time to solve 3.777548e+00 s Time for refinement 3.541499e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.814898e-16 max(|| b_i - A x_i ||_1) 1.868574e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.715046e-03 (SUCCESS) 1940/3626 Test #2313: mpi_dst_example_simple_lap_c_facto3_sched0_not_pqrcpend .................***Timeout 346.64 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.775070e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.391446e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.039256e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.689599e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.574823e+01 s 
+-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 1.331203e-01 s Time to initialize coeftab 1.955841e-01 s Time to factorize 9.345766e+00 s ( 2.17 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 6.965510e-01 s Time for refinement 3.085217e-01 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.585253e-07 max(|| b_i - A x_i ||_1) 9.343217e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.357619e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.585253e-07 max(|| b_i - A x_i ||_1) 9.343217e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.357619e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.585253e-07 max(|| b_i - A x_i ||_1) 9.343217e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.357619e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.585253e-07 max(|| b_i - A x_i ||_1) 9.343217e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.357619e+00 (SUCCESS) Start 2313: mpi_dst_example_simple_lap_c_facto3_sched0_not_pqrcpend 1940/3626 Test #2314: mpi_dst_example_simple_lap_c_facto3_sched0_kway_pqrcpbegin ..............***Timeout 346.63 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.618023e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.099251e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.615326e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 
1.982671e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.874012e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 2.340678e-01 s Time to initialize coeftab 2.121170e-01 s Time to factorize 4.987093e+00 s ( 4.07 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 6.038749e-01 s - iteration 1 : total iteration time 0.994 s error 6.3171e-11 Time for refinement 1.681390e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.630273e-08 max(|| b_i - A x_i ||_1) 3.381039e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.531539e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.630273e-08 max(|| b_i - A x_i ||_1) 3.381039e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.531539e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.630273e-08 max(|| b_i - A x_i ||_1) 3.381039e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.531539e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.630273e-08 max(|| b_i - A x_i ||_1) 3.381039e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.531539e-01 (SUCCESS) Start 2314: mpi_dst_example_simple_lap_c_facto3_sched0_kway_pqrcpbegin 1940/3626 Test #2315: 
mpi_dst_example_simple_lap_c_facto3_sched0_kway_pqrcpend ................***Timeout 346.61 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.494249e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.014225e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.894156e-02 s +-------------------------------------------------+ Mapping/Scheduling 
subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.963080e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.815054e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 5.262106e-01 s Time to initialize coeftab 2.419980e-01 s Time to factorize 8.115175e+00 s ( 2.50 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 6.602199e-01 s Time for refinement 6.164649e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.582568e-07 max(|| b_i - A x_i ||_1) 9.326536e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.353409e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.582568e-07 max(|| b_i - A x_i ||_1) 9.326536e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.353409e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.582568e-07 max(|| b_i - A x_i ||_1) 9.326536e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.353409e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.582568e-07 max(|| b_i - A x_i ||_1) 9.326536e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.353409e+00 
(SUCCESS) Start 2315: mpi_dst_example_simple_lap_c_facto3_sched0_kway_pqrcpend 1940/3626 Test #2316: mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_pqrcpbegin ...***Timeout 346.13 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.495333e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.094754e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping 
criterion -1 Time for reordering 1.045439e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.992575e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.958479e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 5.698088e-01 s Time to initialize coeftab 5.306227e-01 s Time to factorize 2.440075e+01 s (851.11 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.451303e-01 s - iteration 1 : total iteration time 0.317 s error 6.311e-11 Time for refinement 6.916416e-01 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.629023e-08 max(|| b_i - A x_i ||_1) 3.381212e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.531976e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.629023e-08 max(|| b_i - A x_i ||_1) 3.381212e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.531976e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.629023e-08 max(|| b_i - A x_i ||_1) 3.381212e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 
8.531976e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.629023e-08 max(|| b_i - A x_i ||_1) 3.381212e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.531976e-01 (SUCCESS) Start 2316: mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_pqrcpbegin Test #1823: c_mpi_rep_example_simple_solve_and_refine_lap_c_facto4 ..................***Timeout 343.31 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.362401e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.418338e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.135237e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in 
full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 8.092307e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.485280e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 5.577490e-03 s Time to initialize coeftab 5.048121e-01 s Time to factorize 4.492012e+00 s ( 4.74 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Memory usage of coeftab 137 Ko Time to solve 3.682998e+00 s Time for refinement 3.089632e+00 s || A ||_1 5.112398e-02 || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.831175e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.831175e-07 max(|| b_i - A x_i ||_1) 8.038701e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.028405e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.038701e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.028405e+00 (SUCCESS) max(|| x_i ||_oo) 6.822263e-01 max(|| x_i ||_oo) 6.822263e-01 max(|| x0_i - x_i ||_oo) 4.805480e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 7.043821e-01 (SUCCESS) max(|| x0_i - x_i ||_oo) 4.805480e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 7.043821e-01 (SUCCESS) || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.831175e-07 max(|| b_i - A x_i ||_1) 8.038701e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.028405e+00 (SUCCESS) max(|| x_i ||_oo) 6.822263e-01 max(|| x0_i - x_i ||_oo) 4.805480e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 7.043821e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.831175e-07 max(|| b_i - A x_i ||_1) 8.038701e-08 max(|| b_i - A x_i ||_1 / 
(||A||_1 * ||x_i||_oo * eps)) 2.028405e+00 (SUCCESS) max(|| x_i ||_oo) 6.822263e-01 max(|| x0_i - x_i ||_oo) 4.805480e-07 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 7.043821e-01 (SUCCESS) 1941/3626 Test #2317: mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_pqrcpend .....***Timeout 340.02 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.597234e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 
1.634169e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.305653e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.050532e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.871571e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 4.724418e-01 s Time to initialize coeftab 1.694876e-01 s Time to factorize 5.032084e+00 s ( 4.03 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.025137e+00 s Time for refinement 7.863748e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.584095e-07 max(|| b_i - A x_i ||_1) 9.330522e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.354415e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.584095e-07 max(|| b_i - A x_i ||_1) 9.330522e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.354415e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.584095e-07 max(|| b_i - A x_i ||_1) 9.330522e-08 max(|| b_i - 
A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.354415e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.584095e-07 max(|| b_i - A x_i ||_1) 9.330522e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.354415e+00 (SUCCESS) Start 2317: mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_pqrcpend 1941/3626 Test #2318: mpi_dst_example_simple_lap_c_facto3_sched0_not_rqrcpbegin ...............***Timeout 339.41 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.112409e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol 
factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.477066e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.448472e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.400315e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.281555e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 9.815857e-01 s Time to initialize coeftab 3.285132e-01 s Time to factorize 1.566941e+01 s ( 1.29 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 9.081028e-03 s - iteration 1 : total iteration time 0.00665 s error 6.3329e-11 Time for refinement 1.369239e-02 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.607526e-08 max(|| b_i - A x_i ||_1) 3.369537e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.502515e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.607526e-08 max(|| b_i - A x_i ||_1) 
3.369537e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.502515e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.607526e-08 max(|| b_i - A x_i ||_1) 3.369537e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.502515e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.607526e-08 max(|| b_i - A x_i ||_1) 3.369537e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.502515e-01 (SUCCESS) Start 2318: mpi_dst_example_simple_lap_c_facto3_sched0_not_rqrcpbegin 1941/3626 Test #2319: mpi_dst_example_simple_lap_c_facto3_sched0_not_rqrcpend .................***Timeout 338.15 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ 
Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.874240e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.911323e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.733817e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.424019e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.230078e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 3.369258e-01 s Time to initialize coeftab 4.008692e-01 s Time to factorize 1.473747e+01 s ( 1.38 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.781041e-02 s Time for refinement 1.174872e-02 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.589407e-07 max(|| b_i - A x_i ||_1) 9.342030e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo 
* eps)) 2.357319e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.589407e-07 max(|| b_i - A x_i ||_1) 9.342030e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.357319e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.589407e-07 max(|| b_i - A x_i ||_1) 9.342030e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.357319e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.589407e-07 max(|| b_i - A x_i ||_1) 9.342030e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.357319e+00 (SUCCESS) Start 2319: mpi_dst_example_simple_lap_c_facto3_sched0_not_rqrcpend 1941/3626 Test #2320: mpi_dst_example_simple_lap_c_facto3_sched0_kway_rqrcpbegin ..............***Timeout 335.40 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 
0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.489348e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.257635e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.619053e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.691047e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.713141e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 3.985739e-01 s Time to initialize coeftab 1.260950e+00 s Time to factorize 8.263617e+00 s ( 2.45 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.032550e-01 s - iteration 1 : total iteration time 0.0591 s error 6.3249e-11 Time for refinement 1.539009e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| 
x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.612598e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.612598e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.612598e-08 max(|| b_i - A x_i ||_1) 3.376479e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.520032e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.612598e-08 max(|| b_i - A x_i ||_1) 3.376479e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.520032e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 3.376479e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.520032e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 3.376479e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.520032e-01 (SUCCESS) Start 2320: mpi_dst_example_simple_lap_c_facto3_sched0_kway_rqrcpbegin 1941/3626 Test #2321: mpi_dst_example_simple_lap_c_facto3_sched0_kway_rqrcpend ................***Timeout 332.62 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 
2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.560855e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.408362e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.025560e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.370511e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.810661e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 3.676106e-01 s Time to initialize coeftab 6.107387e-01 s Time to factorize 5.541073e+00 s ( 3.66 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 7.819171e-02 s Time for refinement 5.701638e-02 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 
2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.589400e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.589400e-07 max(|| b_i - A x_i ||_1) 9.341289e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.357132e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.589400e-07 max(|| b_i - A x_i ||_1) 9.341289e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.357132e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.589400e-07 max(|| b_i - A x_i ||_1) 9.341289e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.357132e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 9.341289e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.357132e+00 (SUCCESS) Start 2321: mpi_dst_example_simple_lap_c_facto3_sched0_kway_rqrcpend 1941/3626 Test #2322: mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_rqrcpbegin ...***Timeout 330.50 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per 
block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.822123e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.512328e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.621740e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 9.411184e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.839401e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 7.344357e+00 s Time to initialize coeftab 9.992549e-01 s Start 2322: mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_rqrcpbegin 1941/3626 Test #2323: mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_rqrcpend .....***Timeout 326.74 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: 
sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.505207e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.188500e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.199453e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.337287e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.679933e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize 
internal csc 1.136025e+00 s Time to initialize coeftab 4.091995e-01 s Time to factorize 1.386560e+01 s ( 1.46 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 2.332533e-01 s Time for refinement 2.062955e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.589400e-07 max(|| b_i - A x_i ||_1) 9.341289e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.357132e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.589400e-07 max(|| b_i - A x_i ||_1) 9.341289e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.357132e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.589400e-07 max(|| b_i - A x_i ||_1) 9.341289e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.357132e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.589400e-07 max(|| b_i - A x_i ||_1) 9.341289e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.357132e+00 (SUCCESS) Start 2323: mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_rqrcpend 1941/3626 Test #2324: mpi_dst_example_simple_lap_c_facto3_sched0_not_tqrcpbegin ...............***Timeout 323.09 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been 
automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.861801e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.463269e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.920122e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.340963e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 
3.245588e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 5.455103e-01 s Time to initialize coeftab 8.460064e-01 s Time to factorize 3.029260e+01 s (685.57 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.710013e-01 s - iteration 1 : total iteration time 0.0592 s error 6.2991e-11 Time for refinement 1.314853e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.536904e-08 max(|| b_i - A x_i ||_1) 3.333061e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.410473e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.536904e-08 max(|| b_i - A x_i ||_1) 3.333061e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.410473e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.536904e-08 max(|| b_i - A x_i ||_1) 3.333061e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.410473e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.536904e-08 max(|| b_i - A x_i ||_1) 3.333061e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.410473e-01 (SUCCESS) Start 2324: mpi_dst_example_simple_lap_c_facto3_sched0_not_tqrcpbegin 1941/3626 Test #2325: mpi_dst_example_simple_lap_c_facto3_sched0_not_tqrcpend .................***Timeout 319.00 sec ischedInit: The thread number has been automatically set 
to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.501293e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.069965e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.617806e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 
MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.281008e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.706309e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 1.047618e+00 s Time to initialize coeftab 4.628078e-01 s Time to factorize 1.203274e+01 s ( 1.69 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 8.873233e-01 s Time for refinement 6.347080e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.586165e-07 max(|| b_i - A x_i ||_1) 9.343970e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.357809e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.586165e-07 max(|| b_i - A x_i ||_1) 9.343970e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.357809e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.586165e-07 max(|| b_i - A x_i ||_1) 9.343970e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.357809e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.586165e-07 max(|| b_i - A x_i ||_1) 9.343970e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.357809e+00 (SUCCESS) Start 2325: mpi_dst_example_simple_lap_c_facto3_sched0_not_tqrcpend 1941/3626 Test #2326: 
mpi_dst_example_simple_lap_c_facto3_sched0_kway_tqrcpbegin ..............***Timeout 317.53 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.229831e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.440397e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.549452e-01 s +-------------------------------------------------+ 
Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.703116e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.441474e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 2.908994e-01 s Time to initialize coeftab 8.187715e-01 s Time to factorize 9.370150e+00 s ( 2.16 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 7.356610e-02 s - iteration 1 : total iteration time 0.0316 s error 6.3139e-11 Time for refinement 1.100035e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.561991e-08 max(|| b_i - A x_i ||_1) 3.342382e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.433992e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.561991e-08 max(|| b_i - A x_i ||_1) 3.342382e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.433992e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.561991e-08 max(|| b_i - A x_i ||_1) 3.342382e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.433992e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.561991e-08 max(|| b_i - A x_i ||_1) 
3.342382e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.433992e-01 (SUCCESS) Start 2326: mpi_dst_example_simple_lap_c_facto3_sched0_kway_tqrcpbegin 1941/3626 Test #2327: mpi_dst_example_simple_lap_c_facto3_sched0_kway_tqrcpend ................***Timeout 317.37 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.895108e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.702715e-03 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.347901e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.897549e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.109507e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 3.559674e-01 s Time to initialize coeftab 4.209652e-01 s Time to factorize 6.608061e+00 s ( 3.07 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 8.686069e-02 s Time for refinement 1.583597e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.586768e-07 max(|| b_i - A x_i ||_1) 9.344546e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.357954e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.586768e-07 max(|| b_i - A x_i ||_1) 9.344546e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.357954e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.586768e-07 max(|| b_i - A x_i ||_1) 9.344546e-08 max(|| b_i - A x_i ||_1 / 
(||A||_1 * ||x_i||_oo * eps)) 2.357954e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.586768e-07 max(|| b_i - A x_i ||_1) 9.344546e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.357954e+00 (SUCCESS) Start 2327: mpi_dst_example_simple_lap_c_facto3_sched0_kway_tqrcpend 1941/3626 Test #2328: mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_tqrcpbegin ...***Timeout 313.73 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.300528e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: 
Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.080405e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.201649e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.967580e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.455782e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 1.462571e+00 s Time to initialize coeftab 5.120869e-01 s Time to factorize 1.018288e+01 s ( 1.99 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.056337e-01 s - iteration 1 : total iteration time 0.112 s error 6.2991e-11 Time for refinement 1.915103e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.536904e-08 max(|| b_i - A x_i ||_1) 3.333061e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.410473e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.536904e-08 max(|| b_i - A x_i ||_1) 3.333061e-08 max(|| b_i - A 
x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.410473e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.536904e-08 max(|| b_i - A x_i ||_1) 3.333061e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.410473e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.536904e-08 max(|| b_i - A x_i ||_1) 3.333061e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.410473e-01 (SUCCESS) Start 2328: mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_tqrcpbegin 1941/3626 Test #2329: mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_tqrcpend .....***Timeout 313.54 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering 
subtask : Ordering method is: Scotch Time to compute ordering 2.726570e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.160365e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.624408e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 4.619858e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.318441e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 1.331392e+00 s Time to initialize coeftab 2.137767e+00 s Time to factorize 2.614856e+01 s (794.22 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 2.489161e-01 s Time for refinement 2.243300e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.586165e-07 max(|| b_i - A x_i ||_1) 9.343970e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 
2.357809e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.586165e-07 max(|| b_i - A x_i ||_1) 9.343970e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.357809e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.586165e-07 max(|| b_i - A x_i ||_1) 9.343970e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.357809e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.586165e-07 max(|| b_i - A x_i ||_1) 9.343970e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.357809e+00 (SUCCESS) Start 2329: mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_tqrcpend 1941/3626 Test #2330: mpi_dst_example_simple_lap_c_facto3_sched0_not_rqrrtbegin ...............***Timeout 312.62 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 
Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.370403e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.610971e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.983516e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.411423e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.765335e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 2.920081e-01 s Time to initialize coeftab 4.743434e+00 s Time to factorize 2.054383e+01 s (1010.89 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 2.970085e-01 s - iteration 1 : total iteration time 0.391 s error 6.2709e-11 Time for refinement 9.070021e-01 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 
2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.529413e-08 max(|| b_i - A x_i ||_1) 3.320649e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.379152e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.529413e-08 max(|| b_i - A x_i ||_1) 3.320649e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.379152e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.529413e-08 max(|| b_i - A x_i ||_1) 3.320649e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.379152e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.529413e-08 max(|| b_i - A x_i ||_1) 3.320649e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.379152e-01 (SUCCESS) Start 2330: mpi_dst_example_simple_lap_c_facto3_sched0_not_rqrrtbegin 1941/3626 Test #2331: mpi_dst_example_simple_lap_c_facto3_sched0_not_rqrrtend .................***Timeout 312.24 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of 
projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 +-------------------------------------------------+ Ordering subtask : 1: 300 1140 2: 200 760 3: 200 660 Ordering method is: Scotch Time to compute ordering 2.523232e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.469581e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.247363e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.692310e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.832700e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 1.633892e-01 s Time to initialize coeftab 2.973443e-01 s Time to factorize 7.061984e+00 s ( 2.87 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 8.900489e-02 s Time for refinement 1.202354e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 
5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.590452e-07 max(|| b_i - A x_i ||_1) 9.349400e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.359179e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.590452e-07 max(|| b_i - A x_i ||_1) 9.349400e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.359179e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.590452e-07 max(|| b_i - A x_i ||_1) 9.349400e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.359179e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.590452e-07 max(|| b_i - A x_i ||_1) 9.349400e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.359179e+00 (SUCCESS) Start 2331: mpi_dst_example_simple_lap_c_facto3_sched0_not_rqrrtend 1941/3626 Test #2332: mpi_dst_example_simple_lap_c_facto3_sched0_kway_rqrrtbegin ..............***Timeout 310.57 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 
1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.565404e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.149156e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.190854e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.260030e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.876543e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 4.398166e-01 s Time to initialize coeftab 5.476501e-01 s Time to factorize 9.897180e+00 s ( 2.05 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 4.799956e-01 s - iteration 1 : total iteration time 0.441 s error 6.2709e-11 Time for refinement 9.457235e-01 s || A ||_1 
5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.529413e-08 max(|| b_i - A x_i ||_1) 3.320649e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.379152e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.529413e-08 max(|| b_i - A x_i ||_1) 3.320649e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.379152e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.529413e-08 max(|| b_i - A x_i ||_1) 3.320649e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.379152e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.529413e-08 max(|| b_i - A x_i ||_1) 3.320649e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.379152e-01 (SUCCESS) Start 2332: mpi_dst_example_simple_lap_c_facto3_sched0_kway_rqrrtbegin 1941/3626 Test #2333: mpi_dst_example_simple_lap_c_facto3_sched0_kway_rqrrtend ................***Timeout 310.48 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time 
Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.677404e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.857050e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.697334e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.103123e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.956129e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 3.799997e-01 s Time to initialize coeftab 3.428621e+00 s Time to factorize 3.917467e+00 s ( 5.18 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko 
------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 2.746112e-01 s Time for refinement 2.942692e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.593947e-07 max(|| b_i - A x_i ||_1) 9.367327e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.363703e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.593947e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.593947e-07 max(|| b_i - A x_i ||_1) 9.367327e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.363703e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 9.367327e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.363703e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.593947e-07 max(|| b_i - A x_i ||_1) 9.367327e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.363703e+00 (SUCCESS) Start 2333: mpi_dst_example_simple_lap_c_facto3_sched0_kway_rqrrtend 1941/3626 Test #2334: mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_rqrrtbegin ...***Timeout 309.67 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: 
Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.744693e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.960836e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.903833e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.328325e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.124452e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 8.361179e-01 s Time to initialize coeftab 3.172650e-01 s Time to factorize 1.000478e+01 s ( 2.03 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o 
Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 4.659395e-02 s - iteration 1 : total iteration time 0.0312 s error 6.2898e-11 Time for refinement 6.712357e-02 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.535075e-08 max(|| b_i - A x_i ||_1) 3.320843e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.379643e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.535075e-08 max(|| b_i - A x_i ||_1) 3.320843e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.379643e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.535075e-08 max(|| b_i - A x_i ||_1) 3.320843e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.379643e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.535075e-08 max(|| b_i - A x_i ||_1) 3.320843e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.379643e-01 (SUCCESS) Start 2334: mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_rqrrtbegin 1941/3626 Test #2335: mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_rqrrtend .....***Timeout 309.09 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread 
static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.925291e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.497973e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.093708e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.374435e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.128502e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 4.671730e-01 s Time to initialize coeftab 5.059328e-01 s Time to factorize 
2.614653e+01 s (794.28 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.319859e-01 s Time for refinement 1.099768e-01 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.593985e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.593985e-07 max(|| b_i - A x_i ||_1) 9.362829e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.362568e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.593985e-07 max(|| b_i - A x_i ||_1) 9.362829e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.362568e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 9.362829e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.362568e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.593985e-07 max(|| b_i - A x_i ||_1) 9.362829e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.362568e+00 (SUCCESS) Start 2335: mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_rqrrtend 1941/3626 Test #2336: mpi_dst_example_simple_lap_c_facto3_sched0_kway_pqrcpilu0 ...............***Timeout 301.24 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel 
Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.449610e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.955718e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.116120e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.998462e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.918588e+01 s +-------------------------------------------------+ Factorization task: Factorization 
used: LL^t Time to initialize internal csc 3.612060e-01 s Time to initialize coeftab 1.375076e-01 s Time to factorize 1.448644e+01 s ( 1.40 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.259243e-01 s - iteration 1 : total iteration time 0.0373 s error 3.0421e-11 Time for refinement 1.056568e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.418161e-08 max(|| b_i - A x_i ||_1) 3.231556e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.154340e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.418161e-08 max(|| b_i - A x_i ||_1) 3.231556e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.154340e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.418161e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.418161e-08 max(|| b_i - A x_i ||_1) 3.231556e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.154340e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 3.231556e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.154340e-01 (SUCCESS) Start 2336: mpi_dst_example_simple_lap_c_facto3_sched0_kway_pqrcpilu0 1941/3626 Test #2337: mpi_dst_example_simple_lap_c_facto3_sched0_kway_pqrcpilu1 ...............***Timeout 295.26 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.409225e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.330180e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.041538e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 
1.003318e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.530118e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 1.997163e-01 s Time to initialize coeftab 1.110281e-01 s Time to factorize 1.548903e+01 s ( 1.31 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 9.275497e-03 s - iteration 1 : total iteration time 0.00666 s error 3.0421e-11 Time for refinement 1.436866e-02 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.418161e-08 max(|| b_i - A x_i ||_1) 3.231556e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.154340e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.418161e-08 max(|| b_i - A x_i ||_1) 3.231556e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.154340e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.418161e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.418161e-08 max(|| b_i - A x_i ||_1) 3.231556e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.154340e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 3.231556e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.154340e-01 (SUCCESS) Start 2337: mpi_dst_example_simple_lap_c_facto3_sched0_kway_pqrcpilu1 1941/3626 Test #2339: 
mpi_dst_example_simple_lap_c_facto4_sched0_not_svdend ...................***Timeout 289.58 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.985218e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.630800e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.612675e-01 s +-------------------------------------------------+ 
Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.543064e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.551834e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 5.995604e-01 s Time to initialize coeftab 3.917767e-01 s Time to factorize 5.965995e+00 s ( 3.57 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 4.796059e-01 s Time for refinement 4.244768e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.813394e-07 max(|| b_i - A x_i ||_1) 7.923965e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.999492e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.813394e-07 max(|| b_i - A x_i ||_1) 7.923965e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.999492e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.813394e-07 max(|| b_i - A x_i ||_1) 7.923965e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.999492e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.813394e-07 max(|| b_i - A x_i ||_1) 7.923965e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * 
eps)) 1.999492e+00 (SUCCESS) Start 2339: mpi_dst_example_simple_lap_c_facto4_sched0_not_svdend 1941/3626 Test #2340: mpi_dst_example_simple_lap_c_facto4_sched0_kway_svdbegin ................***Timeout 288.83 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.315663e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.476642e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping 
criterion -1 Time for reordering 2.211404e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 4.375124e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.847098e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 4.583470e-01 s Time to initialize coeftab 1.214988e+00 s Time to factorize 1.415864e+01 s ( 1.50 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 9.730044e-03 s Time for refinement 5.885003e-03 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.935636e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.935636e-07 max(|| b_i - A x_i ||_1) 8.673459e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.188615e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.935636e-07 max(|| b_i - A x_i ||_1) 8.673459e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.188615e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.935636e-07 max(|| b_i - A x_i ||_1) 8.673459e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 
2.188615e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.673459e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.188615e+00 (SUCCESS) Start 2340: mpi_dst_example_simple_lap_c_facto4_sched0_kway_svdbegin 1941/3626 Test #2341: mpi_dst_example_simple_lap_c_facto4_sched0_kway_svdend ..................***Timeout 288.06 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.207863e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 
4.852529e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.172915e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 4.252597e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.256890e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 6.232733e-01 s Time to initialize coeftab 5.820349e-01 s Time to factorize 2.264631e+01 s (963.48 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 2.411934e-01 s Time for refinement 2.186910e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.813394e-07 max(|| b_i - A x_i ||_1) 7.923965e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.999492e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.813394e-07 max(|| b_i - A x_i ||_1) 7.923965e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.999492e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.813394e-07 max(|| b_i - A x_i ||_1) 7.923965e-08 max(|| 
b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.999492e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.813394e-07 max(|| b_i - A x_i ||_1) 7.923965e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.999492e+00 (SUCCESS) Start 2341: mpi_dst_example_simple_lap_c_facto4_sched0_kway_svdend 1941/3626 Test #2343: mpi_dst_example_simple_lap_c_facto4_sched0_kwayprojections_svdend .......***Timeout 287.68 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.705820e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol 
factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.594159e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.447518e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.390420e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.902750e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 3.664709e-01 s Time to initialize coeftab 3.937515e-01 s Time to factorize 2.040827e+01 s ( 1.04 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 4.062110e-02 s Time for refinement 2.979265e-02 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.813394e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.813394e-07 max(|| b_i - A x_i ||_1) 7.923965e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.999492e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.813394e-07 max(|| b_i - A x_i ||_1) 7.923965e-08 
max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.999492e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.813394e-07 max(|| b_i - A x_i ||_1) 7.923965e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.999492e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 7.923965e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.999492e+00 (SUCCESS) Start 2343: mpi_dst_example_simple_lap_c_facto4_sched0_kwayprojections_svdend 1941/3626 Test #2344: mpi_dst_example_simple_lap_c_facto4_sched0_not_pqrcpbegin ...............***Timeout 287.29 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to 
compute ordering 3.248689e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.899460e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.672025e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.079317e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.648601e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 5.328617e-01 s Time to initialize coeftab 5.449735e-01 s Time to factorize 1.453610e+01 s ( 1.47 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 2.672926e-01 s - iteration 1 : total iteration time 0.262 s error 6.3017e-11 Time for refinement 5.792345e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.496733e-08 max(|| b_i - A x_i ||_1) 3.175906e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * 
||x_i||_oo * eps)) 8.013915e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.496733e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.496733e-08 max(|| b_i - A x_i ||_1) 3.175906e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.013915e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.496733e-08 max(|| b_i - A x_i ||_1) 3.175906e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.013915e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 3.175906e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.013915e-01 (SUCCESS) Start 2344: mpi_dst_example_simple_lap_c_facto4_sched0_not_pqrcpbegin 1941/3626 Test #2345: mpi_dst_example_simple_lap_c_facto4_sched0_not_pqrcpend .................***Timeout 287.24 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 
Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.739570e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.541488e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.108630e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.979120e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.956327e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 1.017566e+00 s Time to initialize coeftab 1.364986e-01 s Time to factorize 5.309010e+00 s ( 4.01 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.890013e-01 s Time for refinement 2.450019e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i 
||_2 / || b_i ||_2) 2.515058e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.515058e-07 max(|| b_i - A x_i ||_1) 8.568539e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.162140e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.568539e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.162140e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.515058e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.515058e-07 max(|| b_i - A x_i ||_1) 8.568539e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.162140e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.568539e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.162140e+00 (SUCCESS) Start 2345: mpi_dst_example_simple_lap_c_facto4_sched0_not_pqrcpend 1941/3626 Test #2346: mpi_dst_example_simple_lap_c_facto4_sched0_kway_pqrcpbegin ..............***Timeout 268.35 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has 
been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.009501e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.325439e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.916982e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.848353e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.816313e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 1.333850e-01 s Time to initialize coeftab 4.353393e-01 s Time to factorize 2.920566e+00 s ( 7.30 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.289432e-01 s - iteration 1 : total iteration time 0.133 s error 6.3017e-11 Time for refinement 2.689627e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 
max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.496733e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.496733e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.496733e-08 max(|| b_i - A x_i ||_1) 3.175906e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.013915e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 3.175906e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.013915e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 3.175906e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.013915e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.496733e-08 max(|| b_i - A x_i ||_1) 3.175906e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.013915e-01 (SUCCESS) Start 2346: mpi_dst_example_simple_lap_c_facto4_sched0_kway_pqrcpbegin 1941/3626 Test #2347: mpi_dst_example_simple_lap_c_facto4_sched0_kway_pqrcpend ................***Timeout 268.17 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS 
Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.647118e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.143218e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.188629e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.198932e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.876051e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 2.246820e-01 s Time to initialize coeftab 6.604683e-02 s Time to factorize 1.813622e+00 s (11.75 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.820528e-01 s Time for refinement 2.184956e-01 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| 
x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.515241e-07 max(|| b_i - A x_i ||_1) 8.565799e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.161449e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.515241e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.515241e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.515241e-07 max(|| b_i - A x_i ||_1) 8.565799e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.161449e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.565799e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.161449e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.565799e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.161449e+00 (SUCCESS) Start 2347: mpi_dst_example_simple_lap_c_facto4_sched0_kway_pqrcpend 1941/3626 Test #2348: mpi_dst_example_simple_lap_c_facto4_sched0_kwayprojections_pqrcpbegin ...***Timeout 267.60 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 
Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.443438e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.002791e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.754514e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.737587e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.541269e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 1.522871e-01 s Time to initialize coeftab 2.531599e-01 s Time to factorize 8.277787e+00 s ( 2.57 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 3.794994e-01 
s - iteration 1 : total iteration time 0.651 s error 6.3017e-11 Time for refinement 1.316575e+00 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.496733e-08 max(|| b_i - A x_i ||_1) 3.175906e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.013915e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.496733e-08 max(|| b_i - A x_i ||_1) 3.175906e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.013915e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.496733e-08 max(|| b_i - A x_i ||_1) 3.175906e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.013915e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.496733e-08 max(|| b_i - A x_i ||_1) 3.175906e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.013915e-01 (SUCCESS) Start 2348: mpi_dst_example_simple_lap_c_facto4_sched0_kwayprojections_pqrcpbegin 1941/3626 Test #2349: mpi_dst_example_simple_lap_c_facto4_sched0_kwayprojections_pqrcpend .....***Timeout 267.26 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models 
CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.597836e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.334988e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.927657e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.537611e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.814254e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 4.309720e-01 s Time to initialize coeftab 3.596035e-01 s Time to factorize 3.020691e+00 s ( 7.05 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in 
diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 5.263053e-01 s Time for refinement 3.391304e-01 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.515204e-07 max(|| b_i - A x_i ||_1) 8.568798e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.162206e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.515204e-07 max(|| b_i - A x_i ||_1) 8.568798e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.162206e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.515204e-07 max(|| b_i - A x_i ||_1) 8.568798e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.162206e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.515204e-07 max(|| b_i - A x_i ||_1) 8.568798e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.162206e+00 (SUCCESS) Start 2349: mpi_dst_example_simple_lap_c_facto4_sched0_kwayprojections_pqrcpend Test #1888: c_mpi_rep_example_step-by-step_single_mm2 ...............................***Timeout 201.94 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication 
support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: IJV N: 1280 nnz: 12029 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.037653e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 10749 Fill-in of L 0.893590 Time to compute symbol matrix 1.531732e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.723681e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 21498 Fill-in 1.787181 Number of operations in full-rank: LU 1.08 MFlops Prediction: Model AMD 6180 MKL Time to factorize 8.513996e-04 s Time for mapping/scheduling 5.677935e+00 s Time to initialize internal csc 2.928028e-01 s Time to initialize coeftab 3.618866e+00 s Time to factorize 4.159850e+00 s (267.02 KFlop/s) Number of operations 4.80 MFlops Number of static pivots 0 Memory usage of coeftab 427 Ko Time to solve 5.513448e+00 s WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Time for refinement 3.982265e+01 s || A ||_1 7.256473e-01 max(|| b_i ||_oo) 3.287903e-01 max(|| x_i ||_oo) 7.029080e-01 || A ||_1 7.256473e-01 max(|| b_i ||_oo) 3.287903e-01 max(|| x_i ||_oo) 7.029080e-01 || A ||_1 7.256473e-01 max(|| 
b_i ||_oo) 3.287903e-01 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.081183e-16 max(|| b_i - A x_i ||_1) 1.159050e-18 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.415762e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.081183e-16 max(|| b_i - A x_i ||_1) 1.159050e-18 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.415762e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.081183e-16 max(|| b_i - A x_i ||_1) 1.159050e-18 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.415762e-04 (SUCCESS) || A ||_1 7.256473e-01 max(|| b_i ||_oo) 3.287903e-01 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.081183e-16 max(|| b_i - A x_i ||_1) 1.159050e-18 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.415762e-04 (SUCCESS) Time to solve 4.532490e+00 s Time for refinement 1.160349e+01 s || A ||_1 7.256473e-01 max(|| b_i ||_oo) 3.287903e-01 max(|| x_i ||_oo) 7.029080e-01 || A ||_1 7.256473e-01 max(|| b_i ||_oo) 3.287903e-01 max(|| x_i ||_oo) 7.029080e-01 || A ||_1 7.256473e-01 max(|| b_i ||_oo) 3.287903e-01 max(|| x_i ||_oo) 7.029080e-01 || A ||_1 7.256473e-01 max(|| b_i ||_oo) 3.287903e-01 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.081190e-16 max(|| b_i - A x_i ||_1) 1.159166e-18 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.416305e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.081190e-16 max(|| b_i - A x_i ||_1) 1.159166e-18 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.416305e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.081190e-16 max(|| b_i - A x_i ||_1) 1.159166e-18 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.416305e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.081190e-16 max(|| b_i - A x_i ||_1) 1.159166e-18 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.416305e-04 (SUCCESS) Time to initialize internal csc 1.479467e-02 s Time to initialize coeftab 6.860243e-02 s Time to 
factorize 3.692130e-01 s ( 2.94 MFlop/s) Number of operations 4.80 MFlops Number of static pivots 0 Memory usage of coeftab 427 Ko Time to solve 7.498652e-01 s Time for refinement 3.648430e+00 s || A ||_1 7.256473e-01 max(|| b_i ||_oo) 3.287903e-01 max(|| x_i ||_oo) 7.029080e-01 || A ||_1 7.256473e-01 max(|| b_i ||_oo) 3.287903e-01 max(|| x_i ||_oo) 7.029080e-01 || A ||_1 7.256473e-01 max(|| b_i ||_oo) 3.287903e-01 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.081183e-16 Test #1903: c_mpi_rep_example_refinement_lap_c_refine_bicgstab_sym ..................***Timeout 201.93 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.574343e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.665297e-01 s +-------------------------------------------------+ Reordering subtask: Split level 
0 Stoping criterion -1 Time for reordering 2.984441e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 3.782592e+00 s Time to initialize internal csc 4.593594e-02 s - iteration 1 : total iteration time 3.08 s error 0.067713 - iteration 2 : total iteration time 2.27 s error 0.010548 - iteration 3 : total iteration time 1.48 s error 0.002058 - iteration 4 : total iteration time 1.82 s error 0.00043473 - iteration 5 : total iteration time 1.61 s error 9.1545e-05 - iteration 6 : total iteration time 1.26 s error 1.8273e-05 - iteration 7 : total iteration time 0.972 s error 3.4435e-06 - iteration 8 : total iteration time 1.13 s error 5.9495e-07 Time for refinement 1.566425e+01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.050375e-07 max(|| b_i - A x_i ||_1) 2.586887e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.527489e+00 (SUCCESS) || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.050375e-07 max(|| b_i - A x_i ||_1) 2.586887e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.527489e+00 (SUCCESS) || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.050375e-07 max(|| b_i - A x_i ||_1) 2.586887e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.527489e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.050375e-07 max(|| b_i - A x_i ||_1) 2.586887e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.527489e+00 (SUCCESS) Test #1907: c_mpi_rep_example_refinement_lap_z_refine_cg_sym 
........................***Timeout 201.92 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.238769e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.688855e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.260803e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.068455e+00 s Time to initialize internal csc 3.356757e-02 s - iteration 1 : total iteration time 6.2 s error 0.20457 - iteration 2 : total iteration time 3.4 s error 0.058883 - iteration 3 : total iteration time 3.39 s 
error 0.018804 - iteration 4 : total iteration time 4.13 s error 0.0064705 - iteration 5 : total iteration time 4.09 s error 0.0022688 - iteration 6 : total iteration time 3.93 s error 0.00080218 - iteration 7 : total iteration time 2.33 s error 0.00027994 - iteration 8 : total iteration time 1.58 s error 9.2911e-05 - iteration 9 : total iteration time 1.32 s error 3.0814e-05 - iteration 10 : total iteration time 1.45 s error 1.0212e-05 - iteration 11 : total iteration time 1.71 s error 3.1309e-06 - iteration 12 : total iteration time 1.42 s error 9.4295e-07 - iteration 13 : total iteration time 1.15 s error 2.8244e-07 - iteration 14 : total iteration time 0.722 s error 8.3271e-08 - iteration 15 : total iteration time 1.29 s error 2.4241e-08 - iteration 16 : total iteration time 1.28 s error 7.1239e-09 - iteration 17 : total iteration time 0.537 s error 1.9923e-09 - iteration 18 : total iteration time 0.601 s error 5.4819e-10 - iteration 19 : total iteration time 0.489 s error 1.674e-10 - iteration 20 : total iteration time 0.71 s error 6.409e-11 - iteration 21 : total iteration time 0.175 s error 2.4264e-11 - iteration 22 : total iteration time 0.361 s error 7.0628e-12 - iteration 23 : total iteration time 0.406 s error 1.9608e-12 - iteration 24 : total iteration time 0.544 s error 5.9077e-13 Time for refinement 4.669781e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.907671e-13 max(|| b_i - A x_i ||_1) 3.424823e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.641989e+00 (SUCCESS) || A ||_1 5.112481e-02 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.907671e-13 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_1) 3.424823e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * 
eps)) 8.641989e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.907671e-13 max(|| b_i - A x_i ||_1) 3.424823e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.641989e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.907671e-13 max(|| b_i - A x_i ||_1) 3.424823e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.641989e+00 (SUCCESS) Test #1912: c_mpi_rep_example_simple_mixed_refine_bicgstab ..........................***Timeout 201.90 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: General Arithmetic: Double Format: CSC N: 1030 nnz: 6858 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.057118e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 51109 Fill-in of L 7.452464 Time to compute symbol matrix 7.671413e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.152541e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: 
Number of non-zeroes in blocked L 102218 Fill-in 14.904929 Number of operations in full-rank: LU 5.50 MFlops Prediction: Model AMD 6180 MKL Time to factorize 7.121319e-04 s Time for mapping/scheduling 1.132175e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.189937e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.045629e-03 s Time to initialize coeftab 5.816398e-02 s Time to factorize 6.864302e-01 s ( 8.01 MFlop/s) Number of operations 9.09 MFlops Number of static pivots 0 Memory usage of coeftab 150 Ko Time to solve 1.021389e+00 s - iteration 1 : total iteration time 2.8 s error 1.393e-15 Time for refinement 3.501233e+00 s || A ||_1 3.076897e-01 max(|| b_i ||_oo) 8.377794e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 3.076897e-01 || A ||_1 3.076897e-01 || A ||_1 3.076897e-01 max(|| b_i ||_oo) 8.377794e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 8.377794e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.416132e-15 max(|| b_i - A x_i ||_1) 4.659670e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.648945e-03 (SUCCESS) max(|| b_i ||_oo) 8.377794e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.416132e-15 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.416132e-15 max(|| b_i - A x_i ||_1) 4.659670e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.648945e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.416132e-15 max(|| b_i - A x_i ||_1) 4.659670e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.648945e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 4.659670e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.648945e-03 (SUCCESS) Test #1919: c_mpi_rep_example_simple_mixed_lap_z_refine_cg_sym ......................***Timeout 201.89 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been 
automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.910363e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.332653e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.784955e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.804322e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.268310e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 5.311209e-02 s Time to initialize coeftab 2.387812e-01 s Time to factorize 1.477587e+00 s (27.05 MFlop/s) Number of 
operations 32.61 MFlops Number of static pivots 0 Memory usage of coeftab 274 Ko Time to solve 1.284082e+00 s - iteration 1 : total iteration time 2.11 s error 4.8465e-14 Time for refinement 4.095880e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.845823e-14 max(|| b_i - A x_i ||_1) 1.358592e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.428188e-01 (SUCCESS) || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.845823e-14 max(|| b_i - A x_i ||_1) 1.358592e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.428188e-01 (SUCCESS) max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.845823e-14 max(|| b_i - A x_i ||_1) 1.358592e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.428188e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.845823e-14 max(|| b_i - A x_i ||_1) 1.358592e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.428188e-01 (SUCCESS) Test #1935: mpi_rep_example_simple_lap_z_facto2_sched0_1d ...........................***Timeout 201.87 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 
Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.375538e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.090346e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.963721e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 3.662795e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.832585e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.933024e-02 s Time to initialize coeftab 8.395620e-02 s Time to factorize 3.790875e+00 s (10.54 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Memory usage of coeftab 548 Ko Time to solve 2.756561e-01 s Time for refinement 2.811100e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.207616e-16 max(|| b_i - A x_i ||_1) 1.631578e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.117025e-03 (SUCCESS) || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.207616e-16 max(|| b_i - A x_i ||_1) 
1.631578e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.117025e-03 (SUCCESS) || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.207616e-16 max(|| b_i - A x_i ||_1) 1.631578e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.117025e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.207616e-16 max(|| b_i - A x_i ||_1) 1.631578e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.117025e-03 (SUCCESS) Test #1939: mpi_rep_example_simple_lap_s_facto1_sched1_1d ...........................***Timeout 201.85 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.892252e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 
8.074048e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.168116e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.653845e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.432102e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.951420e-01 s Time to initialize coeftab 3.675769e-01 s Time to factorize 1.341853e+00 s ( 3.90 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Memory usage of coeftab 68.5 Ko Time to solve 8.831586e-01 s Time for refinement 1.405085e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.751880e-07 || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_1) 7.710990e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.689358e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.751880e-07 max(|| b_i - A x_i ||_1) 7.710990e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.689358e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.751880e-07 max(|| b_i - A x_i ||_1) 7.710990e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.689358e-01 (SUCCESS) || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.751880e-07 max(|| b_i - A x_i ||_1) 7.710990e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.689358e-01 (SUCCESS) Test #1944: 
mpi_rep_example_simple_lap_c_facto0_sched1_1d ...........................***Timeout 201.84 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.950309e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.062940e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.545885e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.817147e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.409128e+01 s +-------------------------------------------------+ 
Factorization task: Factorization used: LL^h Time to initialize internal csc 1.814578e-01 s Time to initialize coeftab 8.149938e-02 s Time to factorize 9.448685e-01 s (21.46 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Memory usage of coeftab 137 Ko Time to solve 6.052401e-01 s Time for refinement 7.095945e-01 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.876271e-07 max(|| b_i - A x_i ||_1) 8.494484e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.143412e+00 (SUCCESS) || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.876271e-07 max(|| b_i - A x_i ||_1) 8.494484e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.143412e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.876271e-07 max(|| b_i - A x_i ||_1) 8.494484e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.143412e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.876271e-07 max(|| b_i - A x_i ||_1) 8.494484e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.143412e+00 (SUCCESS) Test #1952: mpi_rep_example_simple_lap_z_facto3_sched1_1d ...........................***Timeout 201.82 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 
Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.226499e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.016541e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.116054e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 6.129390e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.325501e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 6.217195e-03 s Time to initialize coeftab 3.536294e-01 s Time to factorize 1.407068e+00 s (14.41 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Memory usage of coeftab 274 Ko Time to solve 1.088247e+00 s Time for refinement 2.406291e+00 s || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.251892e-16 max(|| b_i - A x_i ||_1) 1.810724e-16 max(|| b_i - A x_i ||_1 / 
(||A||_1 * ||x_i||_oo * eps)) 4.569069e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.251892e-16 max(|| b_i - A x_i ||_1) 1.810724e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.569069e-03 (SUCCESS) || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.251892e-16 max(|| b_i - A x_i ||_1) 1.810724e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.569069e-03 (SUCCESS) || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.251892e-16 max(|| b_i - A x_i ||_1) 1.810724e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.569069e-03 (SUCCESS) Test #1956: mpi_rep_example_simple_lap_s_facto2_sched4_1d ...........................***Timeout 201.80 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.532515e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol 
factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.371220e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.707809e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 3.798142e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.012687e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.428236e-03 s Time to initialize coeftab 3.761057e-01 s Time to factorize 1.973401e+00 s ( 5.06 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Memory usage of coeftab 137 Ko Time to solve 2.566827e+00 s Time for refinement 1.901411e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.710744e-07 max(|| b_i - A x_i ||_1) 7.608856e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.561021e-01 (SUCCESS) || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.710744e-07 || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.093036e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_1) 7.608856e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.561021e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.710744e-07 max(|| b_i - A x_i ||_1) 7.608856e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.561021e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.710744e-07 max(|| b_i - A x_i ||_1) 
7.608856e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.561021e-01 (SUCCESS) Test #1957: mpi_rep_example_simple_lap_d_facto0_sched4_1d ...........................***Timeout 201.79 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.509596e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.370805e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.969532e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.337971e+00 s +-------------------------------------------------+ Analyze 
task: Total time for analyze 1.870339e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.636171e-01 s Time to initialize coeftab 6.939961e-02 s Time to factorize 1.102081e+00 s ( 4.59 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Memory usage of coeftab 137 Ko Time to solve 1.454929e+00 s Time for refinement 1.206003e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.034745e-16 max(|| b_i - A x_i ||_1) 1.759456e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.210907e-03 (SUCCESS) || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.034745e-16 max(|| b_i - A x_i ||_1) 1.759456e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.210907e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.034745e-16 max(|| b_i - A x_i ||_1) 1.759456e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.210907e-03 (SUCCESS) || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.034745e-16 max(|| b_i - A x_i ||_1) 1.759456e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.210907e-03 (SUCCESS) Test #1961: mpi_rep_example_simple_lap_c_facto1_sched4_1d ...........................***Timeout 201.77 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: 
Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.640464e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.433138e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.124851e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.101428e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.874745e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.109484e-02 s Time to initialize coeftab 3.077802e-01 s Time to factorize 2.709414e+00 s ( 7.86 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Memory usage of coeftab 137 Ko Time to solve 2.416774e+00 s Time for refinement 1.263723e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.835726e-07 max(|| b_i - A x_i ||_1) 8.190268e-08 max(|| b_i - A x_i ||_1 
/ (||A||_1 * ||x_i||_oo * eps)) 2.066650e+00 (SUCCESS) || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.835726e-07 max(|| b_i - A x_i ||_1) 8.190268e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.066650e+00 (SUCCESS) || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.835726e-07 || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_1) 8.190268e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.066650e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.835726e-07 max(|| b_i - A x_i ||_1) 8.190268e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.066650e+00 (SUCCESS) Test #1964: mpi_rep_example_simple_lap_c_facto4_sched4_1d ...........................***Timeout 201.76 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.645105e+01 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.536077e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.295546e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 8.124823e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.809418e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 2.212278e-03 s Time to initialize coeftab 2.469605e-01 s Time to factorize 2.335909e+00 s ( 9.12 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Memory usage of coeftab 137 Ko Time to solve 2.478517e+00 s Time for refinement 2.045615e+00 s || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.830074e-07 max(|| b_i - A x_i ||_1) 8.034304e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.027295e+00 (SUCCESS) || A ||_1 5.112398e-02 || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.830074e-07 max(|| b_i - A x_i ||_1) 8.034304e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.027295e+00 (SUCCESS) max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112398e-02 max(|| b_i ||_oo) 2.468922e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.830074e-07 max(|| b_i - A x_i ||_1) 8.034304e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 
2.027295e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.830074e-07 max(|| b_i - A x_i ||_1) 8.034304e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.027295e+00 (SUCCESS) Test #1971: mpi_dst_example_simple_lap_s_facto1_sched0_1d ...........................***Timeout 201.74 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.619971e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.868750e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.572630e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops 
Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.676793e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.845943e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 5.945737e-01 s Time to initialize coeftab 4.990158e-01 s Time to factorize 1.379275e+00 s ( 3.79 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Memory usage of coeftab 68.5 Ko Time to solve 3.887239e-01 s Time for refinement 6.029504e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.728602e-07 max(|| b_i - A x_i ||_1) 7.528493e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.460231e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.728602e-07 max(|| b_i - A x_i ||_1) 7.528493e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.460231e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.728602e-07 max(|| b_i - A x_i ||_1) 7.528493e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.460231e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.728602e-07 max(|| b_i - A x_i ||_1) 7.528493e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.460231e-01 (SUCCESS) Test #1976: mpi_dst_example_simple_lap_c_facto0_sched0_1d ...........................***Timeout 201.71 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: 
Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.934827e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.272776e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.260385e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.948838e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.182119e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.755839e+00 s Time to initialize coeftab 2.910022e-01 s Time to factorize 5.024997e+00 s ( 4.04 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Memory usage of coeftab 137 Ko Time to solve 8.100005e-01 s Time 
for refinement 4.637930e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.896020e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.896020e-07 max(|| b_i - A x_i ||_1) 8.632809e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.178358e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.632809e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.178358e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.896020e-07 max(|| b_i - A x_i ||_1) 8.632809e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.178358e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.896020e-07 max(|| b_i - A x_i ||_1) 8.632809e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.178358e+00 (SUCCESS) Test #1979: mpi_dst_example_simple_lap_c_facto3_sched0_1d ...........................***Timeout 201.69 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy 
No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.515027e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.108994e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.062277e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.702801e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.734403e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 9.044415e-01 s Time to initialize coeftab 2.630158e-01 s Time to factorize 2.674421e+00 s ( 7.58 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Memory usage of coeftab 137 Ko Time to solve 4.170050e-01 s Time for refinement 4.203582e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.896020e-07 max(|| b_i - A x_i ||_1) 8.632809e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.178358e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i 
||_2) 1.896020e-07 max(|| b_i - A x_i ||_1) 8.632809e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.178358e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.896020e-07 max(|| b_i - A x_i ||_1) 8.632809e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.178358e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.896020e-07 max(|| b_i - A x_i ||_1) 8.632809e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.178358e+00 (SUCCESS) Test #1980: mpi_dst_example_simple_lap_c_facto4_sched0_1d ...........................***Timeout 201.67 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.790597e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.673356e-01 s +-------------------------------------------------+ 
Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.975498e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.685739e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.086937e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 8.108939e-01 s Time to initialize coeftab 9.456691e-02 s Time to factorize 2.932006e+00 s ( 7.27 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Memory usage of coeftab 137 Ko Time to solve 6.549725e-01 s Time for refinement 5.741796e-01 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.799215e-07 max(|| b_i - A x_i ||_1) 7.901235e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.993757e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.799215e-07 max(|| b_i - A x_i ||_1) 7.901235e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.993757e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.799215e-07 max(|| b_i - A x_i ||_1) 7.901235e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.993757e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.799215e-07 max(|| b_i - A x_i ||_1) 7.901235e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.993757e+00 (SUCCESS) Test #1981: mpi_dst_example_simple_lap_z_facto0_sched0_1d ...........................***Timeout 
201.65 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.575982e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.275464e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.410799e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.631303e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.032461e+01 s +-------------------------------------------------+ Factorization task: 
Factorization used: LL^h Time to initialize internal csc 1.372142e+00 s Time to initialize coeftab 2.630018e-01 s Time to factorize 5.925019e+00 s ( 3.42 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Memory usage of coeftab 274 Ko Time to solve 8.461159e-01 s Time for refinement 7.865179e-01 s || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.226284e-16 max(|| b_i - A x_i ||_1) 1.861574e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.697384e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.226284e-16 max(|| b_i - A x_i ||_1) 1.861574e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.697384e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.226284e-16 max(|| b_i - A x_i ||_1) 1.861574e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.697384e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.226284e-16 max(|| b_i - A x_i ||_1) 1.861574e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.697384e-03 (SUCCESS) Test #1982: mpi_dst_example_simple_lap_z_facto1_sched0_1d ...........................***Timeout 201.64 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 
4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.012079e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.993368e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.529425e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.192753e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.235369e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.746863e-01 s Time to initialize coeftab 6.150577e-01 s Time to factorize 1.444812e+00 s (14.75 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Memory usage of coeftab 274 Ko Time to solve 1.555729e-01 s Time for refinement 3.142161e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 
6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.080963e-16 max(|| b_i - A x_i ||_1) 1.752923e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.423220e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.080963e-16 max(|| b_i - A x_i ||_1) 1.752923e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.423220e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.080963e-16 max(|| b_i - A x_i ||_1) 1.752923e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.423220e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.080963e-16 max(|| b_i - A x_i ||_1) 1.752923e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.423220e-03 (SUCCESS) Test #1983: mpi_dst_example_simple_lap_z_facto2_sched0_1d ...........................***Timeout 201.63 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute 
ordering 1.727637e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.325568e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.770946e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 3.876070e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.245392e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.294100e+00 s Time to initialize coeftab 1.430253e-01 s Time to factorize 6.527008e+00 s ( 6.12 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Memory usage of coeftab 548 Ko Time to solve 3.095215e-01 s Time for refinement 1.006172e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.204884e-16 max(|| b_i - A x_i ||_1) 1.691585e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.268442e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.204884e-16 max(|| b_i - A x_i ||_1) 1.691585e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.268442e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.204884e-16 max(|| b_i - A x_i ||_1) 1.691585e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * 
||x_i||_oo * eps)) 4.268442e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.204884e-16 max(|| b_i - A x_i ||_1) 1.691585e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.268442e-03 (SUCCESS) Test #1986: mpi_dst_example_simple_lap_s_facto0_sched1_1d ...........................***Timeout 201.61 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.210154e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.539855e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.893431e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: 
LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.848832e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.813424e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.876446e-01 s Time to initialize coeftab 4.916138e-01 s Time to factorize 7.292255e-01 s ( 6.94 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Memory usage of coeftab 68.5 Ko Time to solve 1.679707e+00 s Time for refinement 3.124134e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.919717e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.919717e-07 max(|| b_i - A x_i ||_1) 8.513047e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.069741e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.513047e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.069741e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.919717e-07 max(|| b_i - A x_i ||_1) 8.513047e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.069741e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.919717e-07 max(|| b_i - A x_i ||_1) 8.513047e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.069741e+00 (SUCCESS) Test #1991: mpi_dst_example_simple_lap_d_facto2_sched1_1d ...........................***Timeout 201.58 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : 
Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.898882e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.338233e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.346029e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.386989e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.207977e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.126846e+00 s Time to initialize coeftab 3.360847e-01 s Time to factorize 1.233198e+00 s ( 8.10 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Memory usage of coeftab 274 Ko Time to solve 
3.479999e-01 s Time for refinement 7.761408e-01 s || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.984347e-16 max(|| b_i - A x_i ||_1) 1.642215e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.063584e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.984347e-16 max(|| b_i - A x_i ||_1) 1.642215e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.063584e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.984347e-16 max(|| b_i - A x_i ||_1) 1.642215e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.063584e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.984347e-16 max(|| b_i - A x_i ||_1) 1.642215e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.063584e-03 (SUCCESS) Test #1992: mpi_dst_example_simple_lap_c_facto0_sched1_1d ...........................***Timeout 201.53 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank 
parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.728371e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.385233e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.336565e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 9.169939e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.909794e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.468579e-01 s Time to initialize coeftab 7.717813e-02 s Time to factorize 5.195752e-01 s (39.03 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Memory usage of coeftab 137 Ko Time to solve 5.334010e-01 s Time for refinement 8.092280e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.930759e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.930759e-07 max(|| b_i - A x_i ||_1) 8.742141e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * 
||x_i||_oo * eps)) 2.205946e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.930759e-07 max(|| b_i - A x_i ||_1) 8.742141e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.205946e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.742141e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.205946e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.930759e-07 max(|| b_i - A x_i ||_1) 8.742141e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.205946e+00 (SUCCESS) Test #1993: mpi_dst_example_simple_lap_c_facto1_sched1_1d ...........................***Timeout 201.43 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.175040e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.799705e-01 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.309643e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.161442e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.560053e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.539189e-01 s Time to initialize coeftab 1.440746e-01 s Time to factorize 2.302019e+00 s ( 9.26 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Memory usage of coeftab 137 Ko Time to solve 4.700358e-01 s Time for refinement 8.297574e-01 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.806602e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.806602e-07 max(|| b_i - A x_i ||_1) 7.899691e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.993367e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.806602e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.806602e-07 max(|| b_i - A x_i ||_1) 7.899691e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.993367e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 7.899691e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.993367e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 7.899691e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.993367e+00 (SUCCESS) Test #2003: 
mpi_dst_example_simple_lap_s_facto1_sched4_1d ...........................***Timeout 201.37 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.062651e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.350419e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.495462e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.608970e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.450360e+01 
s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.455764e+00 s Time to initialize coeftab 1.307961e+00 s Time to factorize 4.786206e+00 s ( 1.09 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Memory usage of coeftab 68.5 Ko Time to solve 5.085725e+00 s Time for refinement 1.893938e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.706161e-07 max(|| b_i - A x_i ||_1) 7.493117e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.415777e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.706161e-07 max(|| b_i - A x_i ||_1) 7.493117e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.415777e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.706161e-07 max(|| b_i - A x_i ||_1) 7.493117e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.415777e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.706161e-07 max(|| b_i - A x_i ||_1) 7.493117e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.415777e-01 (SUCCESS) Test #2104: mpi_dst_example_simple_lap_s_facto2_sched0_kwayprojections_tqrcpbegin ...***Timeout 201.35 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread 
dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.738503e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.115606e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.303218e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.196893e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.939758e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 9.865902e-01 s Time to initialize coeftab 9.577046e-01 s Time to factorize 1.010574e+01 s (1011.72 
KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.4 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 5.068923e-02 s - iteration 1 : total iteration time 0.0787 s error 9.1452e-11 Time for refinement 1.876456e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.149870e-08 max(|| b_i - A x_i ||_1) 2.941496e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.696256e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.149870e-08 max(|| b_i - A x_i ||_1) 2.941496e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.696256e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.149870e-08 max(|| b_i - A x_i ||_1) 2.941496e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.696256e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.149870e-08 max(|| b_i - A x_i ||_1) 2.941496e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.696256e-01 (SUCCESS) Start 2104: mpi_dst_example_simple_lap_s_facto2_sched0_kwayprojections_tqrcpbegin Test #2105: mpi_dst_example_simple_lap_s_facto2_sched0_kwayprojections_tqrcpend .....***Timeout 201.32 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 
Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.906997e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.607300e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.510266e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.682656e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.310133e+01 s 
+-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.285718e+00 s Time to initialize coeftab 1.344193e-01 s Time to factorize 8.226592e+00 s ( 1.21 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 1.048246e-01 s - iteration 1 : total iteration time 0.169 s error 1.6104e-12 Time for refinement 3.436553e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.655580e-08 max(|| b_i - A x_i ||_1) 2.718559e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.416114e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.655580e-08 max(|| b_i - A x_i ||_1) 2.718559e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.416114e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.655580e-08 max(|| b_i - A x_i ||_1) 2.718559e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.416114e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.655580e-08 max(|| b_i - A x_i ||_1) 2.718559e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.416114e-01 (SUCCESS) Start 2105: mpi_dst_example_simple_lap_s_facto2_sched0_kwayprojections_tqrcpend Test #2106: mpi_dst_example_simple_lap_s_facto2_sched0_not_rqrrtbegin ...............***Timeout 201.30 sec ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.761753e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.680300e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.053378e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to 
factorize 1.548061e-03 s Time for mapping/scheduling 2.408848e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.066086e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.313924e-01 s Time to initialize coeftab 3.997041e-01 s Time to factorize 3.610361e+00 s ( 2.77 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 1.050686e-01 s - iteration 1 : total iteration time 0.265 s error 9.0951e-11 Time for refinement 6.159118e-01 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.013796e-08 max(|| b_i - A x_i ||_1) 2.908470e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.654756e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.013796e-08 max(|| b_i - A x_i ||_1) 2.908470e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.654756e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.013796e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.013796e-08 max(|| b_i - A x_i ||_1) 2.908470e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.654756e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.908470e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.654756e-01 (SUCCESS) Start 2106: mpi_dst_example_simple_lap_s_facto2_sched0_not_rqrrtbegin Test #2107: 
mpi_dst_example_simple_lap_s_facto2_sched0_not_rqrrtend .................***Timeout 201.29 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.030904e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.811308e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.733630e-01 s +-------------------------------------------------+ Mapping/Scheduling 
subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 9.712266e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.207784e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.554958e-01 s Time to initialize coeftab 1.204577e-01 s Time to factorize 1.857124e+00 s ( 5.38 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 9.422273e-02 s - iteration 1 : total iteration time 0.0842 s error 3.2754e-12 Time for refinement 2.684296e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.752316e-08 max(|| b_i - A x_i ||_1) 2.802752e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.521912e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.752316e-08 max(|| b_i - A x_i ||_1) 2.802752e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.521912e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.752316e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.752316e-08 max(|| b_i - A x_i ||_1) 2.802752e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.521912e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.802752e-08 max(|| b_i - A 
x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.521912e-01 (SUCCESS) Start 2107: mpi_dst_example_simple_lap_s_facto2_sched0_not_rqrrtend Test #2108: mpi_dst_example_simple_lap_s_facto2_sched0_kway_rqrrtbegin ..............***Timeout 201.26 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.998072e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.928337e-01 s +-------------------------------------------------+ Reordering subtask: 
Split level 0 Stoping criterion -1 Time for reordering 2.913342e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.231464e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.275699e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 5.764951e-01 s Time to initialize coeftab 6.069966e-01 s Time to factorize 8.326003e+00 s ( 1.20 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 5.955910e-01 s - iteration 1 : total iteration time 0.399 s error 9.0946e-11 Time for refinement 9.348258e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.019128e-08 max(|| b_i - A x_i ||_1) 2.914012e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.661720e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.019128e-08 max(|| b_i - A x_i ||_1) 2.914012e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.661720e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.019128e-08 max(|| b_i - A x_i ||_1) 2.914012e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * 
||x_i||_oo * eps)) 3.661720e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.019128e-08 max(|| b_i - A x_i ||_1) 2.914012e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.661720e-01 (SUCCESS) Start 2108: mpi_dst_example_simple_lap_s_facto2_sched0_kway_rqrrtbegin Test #2109: mpi_dst_example_simple_lap_s_facto2_sched0_kway_rqrrtend ................***Timeout 201.23 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.304479e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L 
structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.566095e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.003329e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 3.383973e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.789988e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 7.691601e-01 s Time to initialize coeftab 4.942931e-01 s Time to factorize 3.076781e+00 s ( 3.25 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 1.619984e-01 s - iteration 1 : total iteration time 0.218 s error 3.153e-12 Time for refinement 3.801454e-01 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.765335e-08 max(|| b_i - A x_i ||_1) 2.804532e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.524148e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.765335e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.765335e-08 max(|| b_i - A x_i ||_1) 2.804532e-08 
max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.524148e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.765335e-08 max(|| b_i - A x_i ||_1) 2.804532e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.524148e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.804532e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.524148e-01 (SUCCESS) Start 2109: mpi_dst_example_simple_lap_s_facto2_sched0_kway_rqrrtend Test #2110: mpi_dst_example_simple_lap_s_facto2_sched0_kwayprojections_rqrrtbegin ...***Timeout 201.20 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 
2.727028e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.296258e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.726775e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 6.293562e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.796151e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.898914e-02 s Time to initialize coeftab 4.370923e-01 s Time to factorize 1.220958e+01 s (837.39 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 4.966886e-03 s - iteration 1 : total iteration time 0.00225 s error 9.106e-11 Time for refinement 8.413522e-03 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.010074e-08 max(|| b_i - A x_i ||_1) 2.905459e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 
3.650972e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.010074e-08 max(|| b_i - A x_i ||_1) 2.905459e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.650972e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.010074e-08 max(|| b_i - A x_i ||_1) 2.905459e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.650972e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.010074e-08 max(|| b_i - A x_i ||_1) 2.905459e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.650972e-01 (SUCCESS) Start 2110: mpi_dst_example_simple_lap_s_facto2_sched0_kwayprojections_rqrrtbegin Test #2111: mpi_dst_example_simple_lap_s_facto2_sched0_kwayprojections_rqrrtend .....***Timeout 201.19 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: 
N nnz 0: 300 1140 +-------------------------------------------------+ Ordering subtask : 1: 300 1140 2: 200 760 3: 200 660 Ordering method is: Scotch Time to compute ordering 1.916804e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.385074e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.284286e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.502532e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.117645e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 8.946518e-01 s Time to initialize coeftab 3.443833e-01 s Time to factorize 5.101743e+00 s ( 1.96 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 2.165487e-01 s - iteration 1 : total iteration time 0.547 s error 3.153e-12 Time for refinement 1.068482e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| 
x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.765335e-08 max(|| b_i - A x_i ||_1) 2.804532e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.524148e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.765335e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.765335e-08 max(|| b_i - A x_i ||_1) 2.804532e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.524148e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.804532e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.524148e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.765335e-08 max(|| b_i - A x_i ||_1) 2.804532e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.524148e-01 (SUCCESS) Start 2111: mpi_dst_example_simple_lap_s_facto2_sched0_kwayprojections_rqrrtend Test #2112: mpi_dst_example_simple_lap_s_facto2_sched0_kway_pqrcpilu0 ...............***Timeout 201.16 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically 
set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.808364e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.027747e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.524163e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.896600e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.048052e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.719100e-01 s Time to initialize coeftab 4.423726e-01 s Time to factorize 3.963185e+00 s ( 2.52 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 1.937242e-01 s - iteration 1 : total iteration time 0.373 s error 7.7243e-11 Time for refinement 8.666602e-01 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 
max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.943931e-08 max(|| b_i - A x_i ||_1) 2.927990e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.679284e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.943931e-08 max(|| b_i - A x_i ||_1) 2.927990e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.679284e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.943931e-08 max(|| b_i - A x_i ||_1) 2.927990e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.679284e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.943931e-08 max(|| b_i - A x_i ||_1) 2.927990e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.679284e-01 (SUCCESS) Start 2112: mpi_dst_example_simple_lap_s_facto2_sched0_kway_pqrcpilu0 Test #2113: mpi_dst_example_simple_lap_s_facto2_sched0_kway_pqrcpilu1 ...............***Timeout 201.15 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min 
ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.982657e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.305928e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.296692e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 8.308950e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.913701e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 9.707539e-02 s Time to initialize coeftab 8.345917e-02 s Time to factorize 3.245515e+00 s ( 3.08 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 9.900550e-02 s - iteration 1 : total iteration time 0.0619 s error 7.7243e-11 Time for refinement 2.327083e-01 s || A ||_1 
5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.943931e-08 max(|| b_i - A x_i ||_1) 2.927990e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.679284e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.943931e-08 max(|| b_i - A x_i ||_1) 2.927990e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.679284e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.943931e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.943931e-08 max(|| b_i - A x_i ||_1) 2.927990e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.679284e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.927990e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.679284e-01 (SUCCESS) Start 2113: mpi_dst_example_simple_lap_s_facto2_sched0_kway_pqrcpilu1 Test #2114: mpi_dst_example_simple_lap_d_facto0_sched0_not_svdbegin .................***Timeout 201.12 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per 
block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.915968e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.286227e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.243184e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 9.635377e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.077796e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.064111e-01 s Time to initialize coeftab 4.733182e-01 s Time to factorize 7.300118e+00 s (710.10 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ 
Total 137 Ko / 137 Ko Time to solve 3.456615e-01 s - iteration 1 : total iteration time 0.542 s error 1.969e-14 Time for refinement 1.138163e+00 s || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.969102e-14 max(|| b_i - A x_i ||_1) 3.898355e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.898618e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.969102e-14 max(|| b_i - A x_i ||_1) 3.898355e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.898618e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.969102e-14 max(|| b_i - A x_i ||_1) 3.898355e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.898618e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.969102e-14 max(|| b_i - A x_i ||_1) 3.898355e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.898618e-02 (SUCCESS) Start 2114: mpi_dst_example_simple_lap_d_facto0_sched0_not_svdbegin Test #2115: mpi_dst_example_simple_lap_d_facto0_sched0_not_svdend ...................***Timeout 201.10 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: 
PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.835663e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.896784e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.526447e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 4.619957e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.317370e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.582676e-01 s Time to initialize coeftab 2.090990e-01 s Time to factorize 2.537020e+00 s ( 2.00 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o 
Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 4.675514e-01 s - iteration 1 : total iteration time 0.84 s error 2.4e-16 Time for refinement 1.873211e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.678941e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.678941e-16 max(|| b_i - A x_i ||_1) 7.102672e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.925118e-04 (SUCCESS) max(|| b_i - A x_i ||_1) 7.102672e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.925118e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.678941e-16 max(|| b_i - A x_i ||_1) 7.102672e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.925118e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.678941e-16 max(|| b_i - A x_i ||_1) 7.102672e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.925118e-04 (SUCCESS) Start 2115: mpi_dst_example_simple_lap_d_facto0_sched0_not_svdend Test #2116: mpi_dst_example_simple_lap_d_facto0_sched0_kway_svdbegin ................***Timeout 201.08 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 
4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.112609e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.614002e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.211902e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.335212e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.425750e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 9.014964e-01 s Time to initialize coeftab 4.512874e-01 s Time to factorize 1.300000e+01 s (398.75 KFlop/s) Number of operations 25.89 
MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 6.232671e-02 s - iteration 1 : total iteration time 0.0486 s error 1.969e-14 Time for refinement 1.085307e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.969102e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.969102e-14 max(|| b_i - A x_i ||_1) 3.898355e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.898618e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.969102e-14 max(|| b_i - A x_i ||_1) 3.898355e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.898618e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 3.898355e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.898618e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.969102e-14 max(|| b_i - A x_i ||_1) 3.898355e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.898618e-02 (SUCCESS) Start 2116: mpi_dst_example_simple_lap_d_facto0_sched0_kway_svdbegin Test #2117: mpi_dst_example_simple_lap_d_facto0_sched0_kway_svdend ..................***Timeout 201.06 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started 
thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.771570e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.372372e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.186419e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.047144e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.243240e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize 
internal csc 1.140199e+00 s Time to initialize coeftab 2.124334e-01 s Time to factorize 7.728022e+00 s (670.78 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.561451e-02 s - iteration 1 : total iteration time 0.0218 s error 2.4e-16 Time for refinement 5.611046e-02 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.677991e-16 max(|| b_i - A x_i ||_1) 7.090464e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.909778e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.677991e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.677991e-16 max(|| b_i - A x_i ||_1) 7.090464e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.909778e-04 (SUCCESS) max(|| b_i - A x_i ||_1) 7.090464e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.909778e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.677991e-16 max(|| b_i - A x_i ||_1) 7.090464e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.909778e-04 (SUCCESS) Start 2117: mpi_dst_example_simple_lap_d_facto0_sched0_kway_svdend Test #2118: mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_svdbegin .....***Timeout 201.03 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: 
The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.766219e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.672874e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.142314e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.148996e+00 s +-------------------------------------------------+ Analyze 
task: Total time for analyze 2.884372e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 8.854930e-03 s Time to initialize coeftab 4.291885e-01 s Time to factorize 1.096031e+00 s ( 4.62 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 5.067650e-03 s - iteration 1 : total iteration time 0.00231 s error 1.969e-14 Time for refinement 8.854940e-03 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.969102e-14 max(|| b_i - A x_i ||_1) 3.898355e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.898618e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.969102e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.969102e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.969102e-14 max(|| b_i - A x_i ||_1) 3.898355e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.898618e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 3.898355e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.898618e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 3.898355e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.898618e-02 (SUCCESS) Start 2118: mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_svdbegin Test #2119: mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_svdend .......***Timeout 201.00 sec ischedInit: The thread number 
has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.676499e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.702315e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.547069e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 
MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 7.555802e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.758200e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.445315e-02 s Time to initialize coeftab 2.502814e-01 s Time to factorize 2.489463e-01 s (20.33 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 4.983757e-03 s - iteration 1 : total iteration time 0.00543 s error 2.4e-16 Time for refinement 1.765440e-02 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.677991e-16 max(|| b_i - A x_i ||_1) 7.090464e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.909778e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.677991e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.677991e-16 max(|| b_i - A x_i ||_1) 7.090464e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.909778e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.677991e-16 max(|| b_i - A x_i ||_1) 7.090464e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.909778e-04 (SUCCESS) max(|| b_i - A x_i ||_1) 7.090464e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.909778e-04 (SUCCESS) Start 2119: 
mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_svdend Test #2120: mpi_dst_example_simple_lap_d_facto0_sched0_not_pqrcpbegin ...............***Timeout 200.90 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.965444e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.527912e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 
1.599041e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.131468e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.140134e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.275156e-01 s Time to initialize coeftab 1.439517e-01 s Time to factorize 3.993016e+00 s ( 1.27 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 7.177387e-01 s - iteration 1 : total iteration time 0.979 s error 1.52e-14 Time for refinement 1.780260e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.519462e-14 max(|| b_i - A x_i ||_1) 2.802562e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.521660e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.519462e-14 max(|| b_i - A x_i ||_1) 2.802562e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.521660e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.519462e-14 max(|| b_i - A x_i ||_1) 2.802562e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.521660e-02 (SUCCESS) max(|| b_i - A 
x_i ||_2 / || b_i ||_2) 1.519462e-14 max(|| b_i - A x_i ||_1) 2.802562e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.521660e-02 (SUCCESS) Start 2120: mpi_dst_example_simple_lap_d_facto0_sched0_not_pqrcpbegin Test #2121: mpi_dst_example_simple_lap_d_facto0_sched0_not_pqrcpend .................***Timeout 200.85 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.907934e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute 
symbol matrix 8.987448e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.604031e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.847145e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.350943e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.033174e+00 s Time to initialize coeftab 2.790878e-01 s Time to factorize 2.675676e+00 s ( 1.89 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 7.874758e-01 s - iteration 1 : total iteration time 0.71 s error 2.4333e-16 Time for refinement 1.569029e+00 s || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.688894e-16 max(|| b_i - A x_i ||_1) 6.842253e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.597879e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.688894e-16 max(|| b_i - A x_i ||_1) 6.842253e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.597879e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || 
b_i ||_2) 2.688894e-16 max(|| b_i - A x_i ||_1) 6.842253e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.597879e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.688894e-16 max(|| b_i - A x_i ||_1) 6.842253e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.597879e-04 (SUCCESS) Start 2121: mpi_dst_example_simple_lap_d_facto0_sched0_not_pqrcpend Test #2122: mpi_dst_example_simple_lap_d_facto0_sched0_kway_pqrcpbegin ..............***Timeout 200.81 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.202515e+00 s +-------------------------------------------------+ 
Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.042078e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.693320e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.223953e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.269598e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 7.285041e-01 s Time to initialize coeftab 2.038661e-01 s Time to factorize 2.071623e+00 s ( 2.44 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.770738e-01 s - iteration 1 : total iteration time 0.185 s error 1.52e-14 Time for refinement 5.169292e-01 s || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.519483e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.519483e-14 max(|| b_i - A x_i ||_1) 2.802753e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.521900e-02 
(SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.519483e-14 max(|| b_i - A x_i ||_1) 2.802753e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.521900e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.519483e-14 max(|| b_i - A x_i ||_1) 2.802753e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.521900e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 2.802753e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.521900e-02 (SUCCESS) Start 2122: mpi_dst_example_simple_lap_d_facto0_sched0_kway_pqrcpbegin Test #2123: mpi_dst_example_simple_lap_d_facto0_sched0_kway_pqrcpend ................***Timeout 200.77 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 
+-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.707624e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.556766e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.501387e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.409912e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.897611e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 5.504108e-01 s Time to initialize coeftab 1.092423e-01 s Time to factorize 2.358070e+00 s ( 2.15 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 6.599898e-01 s - iteration 1 : total iteration time 1.06 s error 2.4333e-16 Time for refinement 1.999503e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / 
|| b_i ||_2) 2.688894e-16 max(|| b_i - A x_i ||_1) 6.842253e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.597879e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.688894e-16 max(|| b_i - A x_i ||_1) 6.842253e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.597879e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.688894e-16 max(|| b_i - A x_i ||_1) 6.842253e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.597879e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.688894e-16 max(|| b_i - A x_i ||_1) 6.842253e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.597879e-04 (SUCCESS) Start 2123: mpi_dst_example_simple_lap_d_facto0_sched0_kway_pqrcpend Test #2124: mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_pqrcpbegin ...***Timeout 200.75 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has 
been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.022915e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.159615e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.887566e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.631076e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.506480e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.933164e-01 s Time to initialize coeftab 3.339607e-01 s Time to factorize 6.928169e+00 s (748.22 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 3.729950e-01 s - iteration 1 : total iteration time 0.667 s error 1.52e-14 Time for refinement 1.241373e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i 
||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.519462e-14 max(|| b_i - A x_i ||_1) 2.802562e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.521660e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.519462e-14 max(|| b_i - A x_i ||_1) 2.802562e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.521660e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.519462e-14 max(|| b_i - A x_i ||_1) 2.802562e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.521660e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.519462e-14 max(|| b_i - A x_i ||_1) 2.802562e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.521660e-02 (SUCCESS) Start 2124: mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_pqrcpbegin Test #2125: mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_pqrcpend .....***Timeout 200.73 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance 
criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.486833e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.305249e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.068673e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.050157e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.811597e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.479138e-01 s Time to initialize coeftab 5.168471e-01 s Time to factorize 2.163063e+00 s ( 2.34 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 2.382758e-01 s - iteration 1 : total iteration time 0.406 s error 2.4333e-16 Time for refinement 1.084596e+00 s || A ||_1 
5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.688894e-16 max(|| b_i - A x_i ||_1) 6.842253e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.597879e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.688894e-16 max(|| b_i - A x_i ||_1) 6.842253e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.597879e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.688894e-16 max(|| b_i - A x_i ||_1) 6.842253e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.597879e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.688894e-16 max(|| b_i - A x_i ||_1) 6.842253e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.597879e-04 (SUCCESS) Start 2125: mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_pqrcpend Test #2126: mpi_dst_example_simple_lap_d_facto0_sched0_not_rqrcpbegin ...............***Timeout 200.71 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal 
Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.811739e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.621871e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.556410e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 7.391867e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.896898e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.767607e-02 s Time to initialize coeftab 2.422926e-01 s Time to factorize 5.530108e-01 s ( 9.15 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.4 Ko / 88.6 Ko 
------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 4.801520e-03 s - iteration 1 : total iteration time 0.00252 s error 3.6959e-14 Time for refinement 9.273957e-03 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.695351e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.695351e-14 max(|| b_i - A x_i ||_1) 6.876750e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.641227e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.695351e-14 max(|| b_i - A x_i ||_1) 6.876750e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.641227e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.695351e-14 max(|| b_i - A x_i ||_1) 6.876750e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.641227e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 6.876750e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.641227e-02 (SUCCESS) Start 2126: mpi_dst_example_simple_lap_d_facto0_sched0_not_rqrcpbegin Test #2127: mpi_dst_example_simple_lap_d_facto0_sched0_not_rqrcpend .................***Timeout 200.70 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of 
GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.596082e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.905174e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.836856e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.385450e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.993183e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 9.414058e-01 s Time to initialize coeftab 1.101066e-01 s Time to factorize 2.771727e+00 s ( 1.83 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank 
supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 3.181560e-01 s - iteration 1 : total iteration time 0.21 s error 2.4483e-16 Time for refinement 3.290208e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.726731e-16 max(|| b_i - A x_i ||_1) 7.162566e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.000379e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.726731e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.726731e-16 max(|| b_i - A x_i ||_1) 7.162566e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.000379e-04 (SUCCESS) max(|| b_i - A x_i ||_1) 7.162566e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.000379e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.726731e-16 max(|| b_i - A x_i ||_1) 7.162566e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.000379e-04 (SUCCESS) Start 2127: mpi_dst_example_simple_lap_d_facto0_sched0_not_rqrcpend Test #2128: mpi_dst_example_simple_lap_d_facto0_sched0_kway_rqrcpbegin ..............***Timeout 200.69 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per 
process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.344185e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.045977e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.896906e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.479601e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.932669e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 8.410431e-01 s Time to initialize coeftab 6.100044e-01 s Time to factorize 2.344864e+01 s 
(221.07 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.4 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 2.779978e-01 s - iteration 1 : total iteration time 0.691 s error 3.6959e-14 Time for refinement 1.359074e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.695335e-14 max(|| b_i - A x_i ||_1) 6.876783e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.641268e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.695335e-14 max(|| b_i - A x_i ||_1) 6.876783e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.641268e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.695335e-14 max(|| b_i - A x_i ||_1) 6.876783e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.641268e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.695335e-14 max(|| b_i - A x_i ||_1) 6.876783e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.641268e-02 (SUCCESS) Start 2128: mpi_dst_example_simple_lap_d_facto0_sched0_kway_rqrcpbegin Test #2129: mpi_dst_example_simple_lap_d_facto0_sched0_kway_rqrcpend ................***Timeout 200.68 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.200618e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.091252e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.003141e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.794725e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.538291e+01 s +-------------------------------------------------+ 
Factorization task: Factorization used: LL^h Time to initialize internal csc 1.451142e+00 s Time to initialize coeftab 3.562683e-01 s Time to factorize 5.584755e+00 s (928.20 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 5.750069e-01 s - iteration 1 : total iteration time 1.16 s error 2.4483e-16 Time for refinement 2.213555e+00 s || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.726731e-16 max(|| b_i - A x_i ||_1) 7.162566e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.000379e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.726731e-16 max(|| b_i - A x_i ||_1) 7.162566e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.000379e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.726731e-16 max(|| b_i - A x_i ||_1) 7.162566e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.000379e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.726731e-16 max(|| b_i - A x_i ||_1) 7.162566e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.000379e-04 (SUCCESS) Start 2129: mpi_dst_example_simple_lap_d_facto0_sched0_kway_rqrcpend Test #2130: mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_rqrcpbegin ...***Timeout 200.65 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.475160e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.676410e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.517564e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for 
mapping/scheduling 1.611139e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.728968e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 5.829913e-01 s Time to initialize coeftab 3.223838e-01 s Time to factorize 6.377053e+00 s (812.88 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.4 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 2.394344e-01 s - iteration 1 : total iteration time 0.234 s error 3.6959e-14 Time for refinement 5.611499e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.695351e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.695351e-14 max(|| b_i - A x_i ||_1) 6.876750e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.641227e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 6.876750e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.641227e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.695351e-14 max(|| b_i - A x_i ||_1) 6.876750e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.641227e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.695351e-14 max(|| b_i - A x_i ||_1) 6.876750e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.641227e-02 (SUCCESS) Start 2130: mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_rqrcpbegin Test #2131: 
mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_rqrcpend .....***Timeout 200.63 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.438498e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.393552e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.773988e-01 s +-------------------------------------------------+ 
Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.874202e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.698047e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.887927e-01 s Time to initialize coeftab 2.362803e-01 s Time to factorize 4.919838e+00 s ( 1.03 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 8.414415e-01 s - iteration 1 : total iteration time 0.965 s error 2.4483e-16 Time for refinement 2.566856e+00 s || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.725851e-16 max(|| b_i - A x_i ||_1) 7.156992e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.993376e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.725851e-16 max(|| b_i - A x_i ||_1) 7.156992e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.993376e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.725851e-16 max(|| b_i - A x_i ||_1) 7.156992e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.993376e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.725851e-16 max(|| b_i - A x_i ||_1) 
7.156992e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.993376e-04 (SUCCESS) Start 2131: mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_rqrcpend Test #2132: mpi_dst_example_simple_lap_d_facto0_sched0_not_tqrcpbegin ...............***Timeout 200.62 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.078259e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.566633e-01 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.128844e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.357431e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.265351e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.068059e-01 s Time to initialize coeftab 2.190186e-01 s Time to factorize 8.724094e+00 s (594.19 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.4 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 3.396145e-01 s - iteration 1 : total iteration time 0.535 s error 3.6959e-14 Time for refinement 1.005663e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.695274e-14 max(|| b_i - A x_i ||_1) 6.875946e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.640216e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.695274e-14 max(|| b_i - A x_i ||_1) 6.875946e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.640216e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.695274e-14 
max(|| b_i - A x_i ||_1) 6.875946e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.640216e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.695274e-14 max(|| b_i - A x_i ||_1) 6.875946e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.640216e-02 (SUCCESS) Start 2132: mpi_dst_example_simple_lap_d_facto0_sched0_not_tqrcpbegin Test #2133: mpi_dst_example_simple_lap_d_facto0_sched0_not_tqrcpend .................***Timeout 200.61 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.072972e+01 s +-------------------------------------------------+ Symbolic factorization 
subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.216258e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.386442e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.046543e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.358714e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 5.401783e-01 s Time to initialize coeftab 2.157787e-01 s Time to factorize 2.849869e+00 s ( 1.78 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.435667e-01 s - iteration 1 : total iteration time 0.244 s error 2.4483e-16 Time for refinement 5.420156e-01 s || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.724130e-16 max(|| b_i - A x_i ||_1) 7.145535e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.978978e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.724130e-16 max(|| b_i - A 
x_i ||_1) 7.145535e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.978978e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.724130e-16 max(|| b_i - A x_i ||_1) 7.145535e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.978978e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.724130e-16 max(|| b_i - A x_i ||_1) 7.145535e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.978978e-04 (SUCCESS) Start 2133: mpi_dst_example_simple_lap_d_facto0_sched0_not_tqrcpend Test #2134: mpi_dst_example_simple_lap_d_facto0_sched0_kway_tqrcpbegin ..............***Timeout 200.59 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering 
subtask : Ordering method is: Scotch Time to compute ordering 2.579030e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.509566e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.005145e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.891005e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.780519e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.816764e-01 s Time to initialize coeftab 3.496985e+00 s Time to factorize 1.997245e+01 s (259.55 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.4 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 3.748971e-01 s - iteration 1 : total iteration time 0.254 s error 3.6959e-14 Time for refinement 7.100409e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.695328e-14 max(|| b_i - A x_i ||_2 / || b_i 
||_2) 3.695328e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.695328e-14 max(|| b_i - A x_i ||_1) 6.876824e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.641320e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.695328e-14 max(|| b_i - A x_i ||_1) 6.876824e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.641320e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 6.876824e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.641320e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 6.876824e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.641320e-02 (SUCCESS) Start 2134: mpi_dst_example_simple_lap_d_facto0_sched0_kway_tqrcpbegin Start 2388: mpi_dst_example_simple_lap_z_facto0_sched0_not_tqrcpbegin Start 2389: mpi_dst_example_simple_lap_z_facto0_sched0_not_tqrcpend Start 2390: mpi_dst_example_simple_lap_z_facto0_sched0_kway_tqrcpbegin Start 2391: mpi_dst_example_simple_lap_z_facto0_sched0_kway_tqrcpend Start 2392: mpi_dst_example_simple_lap_z_facto0_sched0_kwayprojections_tqrcpbegin Start 2393: mpi_dst_example_simple_lap_z_facto0_sched0_kwayprojections_tqrcpend Start 2394: mpi_dst_example_simple_lap_z_facto0_sched0_not_rqrrtbegin Start 2395: mpi_dst_example_simple_lap_z_facto0_sched0_not_rqrrtend Start 2396: mpi_dst_example_simple_lap_z_facto0_sched0_kway_rqrrtbegin Start 2397: mpi_dst_example_simple_lap_z_facto0_sched0_kway_rqrrtend Start 2398: mpi_dst_example_simple_lap_z_facto0_sched0_kwayprojections_rqrrtbegin Start 2399: mpi_dst_example_simple_lap_z_facto0_sched0_kwayprojections_rqrrtend Start 2400: mpi_dst_example_simple_lap_z_facto0_sched0_kway_pqrcpilu0 Start 2401: mpi_dst_example_simple_lap_z_facto0_sched0_kway_pqrcpilu1 Start 2402: mpi_dst_example_simple_lap_z_facto1_sched0_not_svdbegin Start 2403: mpi_dst_example_simple_lap_z_facto1_sched0_not_svdend Start 2404: mpi_dst_example_simple_lap_z_facto1_sched0_kway_svdbegin Start 2405: mpi_dst_example_simple_lap_z_facto1_sched0_kway_svdend Start 2406: 
mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_svdbegin Start 2407: mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_svdend Start 2408: mpi_dst_example_simple_lap_z_facto1_sched0_not_pqrcpbegin Start 2409: mpi_dst_example_simple_lap_z_facto1_sched0_not_pqrcpend Start 2410: mpi_dst_example_simple_lap_z_facto1_sched0_kway_pqrcpbegin Start 2411: mpi_dst_example_simple_lap_z_facto1_sched0_kway_pqrcpend Start 2412: mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_pqrcpbegin Start 2413: mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_pqrcpend Start 2414: mpi_dst_example_simple_lap_z_facto1_sched0_not_rqrcpbegin Start 2415: mpi_dst_example_simple_lap_z_facto1_sched0_not_rqrcpend Start 2416: mpi_dst_example_simple_lap_z_facto1_sched0_kway_rqrcpbegin Start 2417: mpi_dst_example_simple_lap_z_facto1_sched0_kway_rqrcpend Start 2418: mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_rqrcpbegin Start 2419: mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_rqrcpend Start 2420: mpi_dst_example_simple_lap_z_facto1_sched0_not_tqrcpbegin Start 2421: mpi_dst_example_simple_lap_z_facto1_sched0_not_tqrcpend Start 2422: mpi_dst_example_simple_lap_z_facto1_sched0_kway_tqrcpbegin Start 2423: mpi_dst_example_simple_lap_z_facto1_sched0_kway_tqrcpend Start 2424: mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_tqrcpbegin Start 2425: mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_tqrcpend Start 2426: mpi_dst_example_simple_lap_z_facto1_sched0_not_rqrrtbegin Start 2427: mpi_dst_example_simple_lap_z_facto1_sched0_not_rqrrtend Start 2428: mpi_dst_example_simple_lap_z_facto1_sched0_kway_rqrrtbegin Start 2429: mpi_dst_example_simple_lap_z_facto1_sched0_kway_rqrrtend Start 2430: mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_rqrrtbegin Start 2431: mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_rqrrtend Start 2432: mpi_dst_example_simple_lap_z_facto1_sched0_kway_pqrcpilu0 Start 2433: 
mpi_dst_example_simple_lap_z_facto1_sched0_kway_pqrcpilu1 Start 2434: mpi_dst_example_simple_lap_z_facto2_sched0_not_svdbegin Start 2435: mpi_dst_example_simple_lap_z_facto2_sched0_not_svdend Start 2436: mpi_dst_example_simple_lap_z_facto2_sched0_kway_svdbegin Start 2437: mpi_dst_example_simple_lap_z_facto2_sched0_kway_svdend Start 2438: mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_svdbegin Start 2439: mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_svdend Start 2440: mpi_dst_example_simple_lap_z_facto2_sched0_not_pqrcpbegin Start 2441: mpi_dst_example_simple_lap_z_facto2_sched0_not_pqrcpend Start 2442: mpi_dst_example_simple_lap_z_facto2_sched0_kway_pqrcpbegin Start 2443: mpi_dst_example_simple_lap_z_facto2_sched0_kway_pqrcpend Start 2444: mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_pqrcpbegin Start 2445: mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_pqrcpend Start 2446: mpi_dst_example_simple_lap_z_facto2_sched0_not_rqrcpbegin Start 2447: mpi_dst_example_simple_lap_z_facto2_sched0_not_rqrcpend Start 2448: mpi_dst_example_simple_lap_z_facto2_sched0_kway_rqrcpbegin Start 2449: mpi_dst_example_simple_lap_z_facto2_sched0_kway_rqrcpend Start 2450: mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_rqrcpbegin Start 2451: mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_rqrcpend Start 2452: mpi_dst_example_simple_lap_z_facto2_sched0_not_tqrcpbegin Start 2453: mpi_dst_example_simple_lap_z_facto2_sched0_not_tqrcpend Start 2454: mpi_dst_example_simple_lap_z_facto2_sched0_kway_tqrcpbegin Start 2455: mpi_dst_example_simple_lap_z_facto2_sched0_kway_tqrcpend Start 2456: mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_tqrcpbegin Start 2457: mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_tqrcpend Start 2458: mpi_dst_example_simple_lap_z_facto2_sched0_not_rqrrtbegin Start 2459: mpi_dst_example_simple_lap_z_facto2_sched0_not_rqrrtend Start 2460: 
mpi_dst_example_simple_lap_z_facto2_sched0_kway_rqrrtbegin Start 2461: mpi_dst_example_simple_lap_z_facto2_sched0_kway_rqrrtend Start 2462: mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_rqrrtbegin Start 2463: mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_rqrrtend Start 2464: mpi_dst_example_simple_lap_z_facto2_sched0_kway_pqrcpilu0 Start 2465: mpi_dst_example_simple_lap_z_facto2_sched0_kway_pqrcpilu1 Start 2466: mpi_dst_example_simple_lap_z_facto3_sched0_not_svdbegin Start 2467: mpi_dst_example_simple_lap_z_facto3_sched0_not_svdend Start 2468: mpi_dst_example_simple_lap_z_facto3_sched0_kway_svdbegin Start 2469: mpi_dst_example_simple_lap_z_facto3_sched0_kway_svdend Start 2470: mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_svdbegin Start 2471: mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_svdend Start 2472: mpi_dst_example_simple_lap_z_facto3_sched0_not_pqrcpbegin Start 2473: mpi_dst_example_simple_lap_z_facto3_sched0_not_pqrcpend Start 2474: mpi_dst_example_simple_lap_z_facto3_sched0_kway_pqrcpbegin Start 2475: mpi_dst_example_simple_lap_z_facto3_sched0_kway_pqrcpend Start 2476: mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_pqrcpbegin Start 2477: mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_pqrcpend 1966/3626 Test #2376: mpi_dst_example_simple_lap_z_facto0_sched0_not_pqrcpbegin ............... Passed 199.09 sec 1967/3626 Test #2355: mpi_dst_example_simple_lap_c_facto4_sched0_kwayprojections_rqrcpend ..... Passed 200.76 sec 1968/3626 Test #2384: mpi_dst_example_simple_lap_z_facto0_sched0_kway_rqrcpbegin .............. Passed 198.96 sec 1969/3626 Test #2366: mpi_dst_example_simple_lap_c_facto4_sched0_kwayprojections_rqrrtbegin ... Passed 199.61 sec 1970/3626 Test #2365: mpi_dst_example_simple_lap_c_facto4_sched0_kway_rqrrtend ................ Passed 199.62 sec 1971/3626 Test #2375: mpi_dst_example_simple_lap_z_facto0_sched0_kwayprojections_svdend ....... 
Passed 199.11 sec 1972/3626 Test #2379: mpi_dst_example_simple_lap_z_facto0_sched0_kway_pqrcpend ................ Passed 199.05 sec 1973/3626 Test #2358: mpi_dst_example_simple_lap_c_facto4_sched0_kway_tqrcpbegin .............. Passed 200.39 sec 1974/3626 Test #2380: mpi_dst_example_simple_lap_z_facto0_sched0_kwayprojections_pqrcpbegin ... Passed 199.04 sec 1975/3626 Test #2382: mpi_dst_example_simple_lap_z_facto0_sched0_not_rqrcpbegin ............... Passed 199.02 sec 1976/3626 Test #2356: mpi_dst_example_simple_lap_c_facto4_sched0_not_tqrcpbegin ...............***Failed 200.63 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 [arch-nspawn-3655178:1351614] *** Process received signal *** [arch-nspawn-3655178:1351614] Signal: Segmentation fault (11) [arch-nspawn-3655178:1351614] Signal code: Address not mapped (1) 
[arch-nspawn-3655178:1351614] Failing at address: 0x7f3e980a2860 [arch-nspawn-3655178:1351614] [ 0] linux-vdso.so.1(__vdso_rt_sigreturn+0x0) [0x7f5a6d3966cc] [arch-nspawn-3655178:1351614] [ 1] /usr/lib/libopen-pal.so.80(mca_btl_sm_poll_handle_frag+0x18a) [0x7f5a5f263a02] [arch-nspawn-3655178:1351614] [ 2] /usr/lib/libopen-pal.so.80(+0x74504) [0x7f5a5f264504] [arch-nspawn-3655178:1351614] [ 3] /usr/lib/libopen-pal.so.80(opal_progress+0x30) [0x7f5a5f215a7a] [arch-nspawn-3655178:1351614] [ 4] /usr/lib/libopen-pal.so.80(ompi_sync_wait_mt+0xda) [0x7f5a5f242aa2] [arch-nspawn-3655178:1351614] [ 5] /usr/lib/libmpi.so.40(+0x7de1a) [0x7f5a6c27de1a] [arch-nspawn-3655178:1351614] [ 6] /usr/lib/libmpi.so.40(ompi_request_default_wait+0x1a) [0x7f5a6c28019c] [arch-nspawn-3655178:1351614] [ 7] /usr/lib/libmpi.so.40(ompi_coll_base_sendrecv_actual+0x98) [0x7f5a6c2f03e8] [arch-nspawn-3655178:1351614] [ 8] /usr/lib/libmpi.so.40(ompi_coll_base_allreduce_intra_recursivedoubling+0x210) [0x7f5a6c2f1a88] [arch-nspawn-3655178:1351614] [ 9] /usr/lib/libmpi.so.40(ompi_coll_base_allreduce_intra_ring+0x3fc) [0x7f5a6c2f443c] [arch-nspawn-3655178:1351614] [10] /usr/lib/libmpi.so.40(ompi_coll_tuned_allreduce_intra_dec_fixed+0x40) [0x7f5a6c315152] [arch-nspawn-3655178:1351614] [11] /usr/lib/libmpi.so.40(MPI_Allreduce+0x294) [0x7f5a6c28e584] [arch-nspawn-3655178:1351614] [12] /build/pastix/src/build/spm/src/libspm.so.1(spmUpdateComputedFields+0x140) [0x7f5a6c5dc458] [arch-nspawn-3655178:1351614] [13] /build/pastix/src/build/spm/src/libspm.so.1(genLaplacian+0xaa) [0x7f5a6c5e521e] [arch-nspawn-3655178:1351614] [14] /build/pastix/src/build/spm/src/libspm.so.1(+0x409c8) [0x7f5a6c5e69c8] [arch-nspawn-3655178:1351614] [15] ./simple(+0xe2c) [0x555555556e2c] [arch-nspawn-3655178:1351614] [16] /usr/lib/libc.so.6(+0x27fae) [0x7f5a6c0a4fae] [arch-nspawn-3655178:1351614] [17] /usr/lib/libc.so.6(__libc_start_main+0x72) [0x7f5a6c0a50b8] [arch-nspawn-3655178:1351614] [18] ./simple(+0x1174) [0x555555557174] 
[arch-nspawn-3655178:1351614] *** End of error message *** [arch-nspawn-3655178:1351488] *** Process received signal *** [arch-nspawn-3655178:1351488] Signal: Segmentation fault (11) [arch-nspawn-3655178:1351488] Signal code: Address not mapped (1) [arch-nspawn-3655178:1351488] Failing at address: 0x7fa2395a12e0 [arch-nspawn-3655178:1351488] [ 0] linux-vdso.so.1(__vdso_rt_sigreturn+0x0) [0x7f28b16b96cc] [arch-nspawn-3655178:1351488] [ 1] /usr/lib/libopen-pal.so.80(mca_btl_sm_poll_handle_frag+0x18a) [0x7f28a34f3a02] [arch-nspawn-3655178:1351488] [ 2] /usr/lib/libopen-pal.so.80(+0x74504) [0x7f28a34f4504] [arch-nspawn-3655178:1351488] [ 3] /usr/lib/libopen-pal.so.80(opal_progress+0x30) [0x7f28a34a5a7a] [arch-nspawn-3655178:1351488] [ 4] /usr/lib/libopen-pal.so.80(ompi_sync_wait_mt+0xda) [0x7f28a34d2aa2] [arch-nspawn-3655178:1351488] [ 5] /usr/lib/libmpi.so.40(+0x7de1a) [0x7f28a3c7de1a] [arch-nspawn-3655178:1351488] [ 6] /usr/lib/libmpi.so.40(ompi_request_default_wait+0x1a) [0x7f28a3c8019c] [arch-nspawn-3655178:1351488] [ 7] /usr/lib/libmpi.so.40(ompi_coll_base_sendrecv_actual+0x98) [0x7f28a3cf03e8] [arch-nspawn-3655178:1351488] [ 8] /usr/lib/libmpi.so.40(ompi_coll_base_allreduce_intra_recursivedoubling+0x210) [0x7f28a3cf1a88] [arch-nspawn-3655178:1351488] [ 9] /usr/lib/libmpi.so.40(ompi_coll_base_allreduce_intra_ring+0x3fc) [0x7f28a3cf443c] [arch-nspawn-3655178:1351488] [10] /usr/lib/libmpi.so.40(ompi_coll_tuned_allreduce_intra_dec_fixed+0x40) [0x7f28a3d15152] [arch-nspawn-3655178:1351488] [11] /usr/lib/libmpi.so.40(MPI_Allreduce+0x294) [0x7f28a3c8e584] [arch-nspawn-3655178:1351488] [12] /build/pastix/src/build/spm/src/libspm.so.1(spmUpdateComputedFields+0x16c) [0x7f28b00be484] [arch-nspawn-3655178:1351488] [13] /build/pastix/src/build/spm/src/libspm.so.1(genLaplacian+0xaa) [0x7f28b00c721e] [arch-nspawn-3655178:1351488] [14] /build/pastix/src/build/spm/src/libspm.so.1(+0x409c8) [0x7f28b00c89c8] [arch-nspawn-3655178:1351488] [15] ./simple(+0xe2c) [0x555555556e2c] 
[arch-nspawn-3655178:1351488] [16] /usr/lib/libc.so.6(+0x27fae) [0x7f28a3aa4fae] [arch-nspawn-3655178:1351488] [17] /usr/lib/libc.so.6(__libc_start_main+0x72) [0x7f28a3aa50b8] [arch-nspawn-3655178:1351488] [18] ./simple(+0x1174) [0x555555557174] [arch-nspawn-3655178:1351488] *** End of error message *** -------------------------------------------------------------------------- prte noticed that process rank 0 with PID 1351488 on node arch-nspawn-3655178 exited on signal 11 (Segmentation fault). -------------------------------------------------------------------------- Start 2356: mpi_dst_example_simple_lap_c_facto4_sched0_not_tqrcpbegin 1976/3626 Test #2387: mpi_dst_example_simple_lap_z_facto0_sched0_kwayprojections_rqrcpend ..... Passed 193.54 sec 1977/3626 Test #2352: mpi_dst_example_simple_lap_c_facto4_sched0_kway_rqrcpbegin .............. Passed 200.96 sec 1978/3626 Test #2357: mpi_dst_example_simple_lap_c_facto4_sched0_not_tqrcpend ................. Passed 200.54 sec 1979/3626 Test #2362: mpi_dst_example_simple_lap_c_facto4_sched0_not_rqrrtbegin ............... Passed 199.75 sec 1980/3626 Test #2371: mpi_dst_example_simple_lap_z_facto0_sched0_not_svdend ................... Passed 199.27 sec 1981/3626 Test #2367: mpi_dst_example_simple_lap_c_facto4_sched0_kwayprojections_rqrrtend ..... Passed 199.61 sec 1982/3626 Test #2361: mpi_dst_example_simple_lap_c_facto4_sched0_kwayprojections_tqrcpend ..... Passed 199.77 sec 1983/3626 Test #2353: mpi_dst_example_simple_lap_c_facto4_sched0_kway_rqrcpend ................ Passed 200.95 sec 1984/3626 Test #2373: mpi_dst_example_simple_lap_z_facto0_sched0_kway_svdend .................. Passed 199.17 sec 1985/3626 Test #2364: mpi_dst_example_simple_lap_c_facto4_sched0_kway_rqrrtbegin .............. Passed 199.68 sec 1986/3626 Test #2378: mpi_dst_example_simple_lap_z_facto0_sched0_kway_pqrcpbegin .............. Passed 199.09 sec 1987/3626 Test #2383: mpi_dst_example_simple_lap_z_facto0_sched0_not_rqrcpend ................. 
Passed 199.01 sec 1988/3626 Test #2350: mpi_dst_example_simple_lap_c_facto4_sched0_not_rqrcpbegin ............... Passed 201.00 sec 1989/3626 Test #2351: mpi_dst_example_simple_lap_c_facto4_sched0_not_rqrcpend ................. Passed 200.99 sec 1990/3626 Test #2368: mpi_dst_example_simple_lap_c_facto4_sched0_kway_pqrcpilu0 ............... Passed 199.61 sec 1991/3626 Test #2381: mpi_dst_example_simple_lap_z_facto0_sched0_kwayprojections_pqrcpend ..... Passed 199.06 sec 1992/3626 Test #2363: mpi_dst_example_simple_lap_c_facto4_sched0_not_rqrrtend ................. Passed 199.75 sec 1993/3626 Test #2385: mpi_dst_example_simple_lap_z_facto0_sched0_kway_rqrcpend ................ Passed 198.70 sec 1994/3626 Test #2359: mpi_dst_example_simple_lap_c_facto4_sched0_kway_tqrcpend ................ Passed 199.84 sec 1995/3626 Test #2369: mpi_dst_example_simple_lap_c_facto4_sched0_kway_pqrcpilu1 ............... Passed 199.44 sec 1996/3626 Test #2377: mpi_dst_example_simple_lap_z_facto0_sched0_not_pqrcpend ................. Passed 199.12 sec 1997/3626 Test #2354: mpi_dst_example_simple_lap_c_facto4_sched0_kwayprojections_rqrcpbegin ... 
Passed 200.95 sec 1998/3626 Test #2232: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_tqrcpbegin ...***Timeout 383.99 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.557782e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.694354e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.968885e-03 s 
+-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.049969e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.207551e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 7.438805e-01 s Time to initialize coeftab 3.916104e+00 s Start 2232: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_tqrcpbegin 1998/3626 Test #2242: mpi_dst_example_simple_lap_c_facto1_sched0_not_svdbegin .................***Timeout 383.85 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian 
Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.552031e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.492686e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.118793e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 4.394181e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.280390e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.027565e+00 s Time to initialize coeftab 1.251996e+00 s Start 2242: mpi_dst_example_simple_lap_c_facto1_sched0_not_svdbegin 1998/3626 Test #2244: mpi_dst_example_simple_lap_c_facto1_sched0_kway_svdbegin ................***Timeout 383.81 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple 
Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.074984e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.485077e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.161086e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.487140e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.489681e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.065076e+00 s Time to initialize coeftab 1.154367e+00 s Start 2244: mpi_dst_example_simple_lap_c_facto1_sched0_kway_svdbegin 1998/3626 Test #2262: mpi_dst_example_simple_lap_c_facto1_sched0_kway_tqrcpbegin ..............***Timeout 
383.43 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 +-------------------------------------------------+ Ordering subtask : 1: 300 1140 2: 200 760 3: 200 660 Ordering method is: Scotch Time to compute ordering 2.408501e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.277377e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.047613e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of 
operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.605762e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.606430e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 5.344393e-01 s Time to initialize coeftab 8.447054e-01 s Start 2262: mpi_dst_example_simple_lap_c_facto1_sched0_kway_tqrcpbegin 1998/3626 Test #2306: mpi_dst_example_simple_lap_c_facto3_sched0_not_svdbegin .................***Timeout 359.10 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 
+-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.301009e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.888910e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.272132e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.526499e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.887785e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 7.617961e+00 s Time to initialize coeftab 1.267557e+00 s Start 2306: mpi_dst_example_simple_lap_c_facto3_sched0_not_svdbegin 1998/3626 Test #2308: mpi_dst_example_simple_lap_c_facto3_sched0_kway_svdbegin ................***Timeout 355.77 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD 
Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.861372e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.600774e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.117653e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.598130e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.128182e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 4.647619e-01 s Time to initialize coeftab 6.431736e-01 s Start 2308: mpi_dst_example_simple_lap_c_facto3_sched0_kway_svdbegin 1998/3626 Test #2310: mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_svdbegin .....***Timeout 352.27 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has 
been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.032282e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.483159e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.340652e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 
5.477282e-04 s Time for mapping/scheduling 1.582933e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.184751e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 3.838110e-01 s Time to initialize coeftab 5.311231e-01 s Start 2310: mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_svdbegin 1998/3626 Test #2338: mpi_dst_example_simple_lap_c_facto4_sched0_not_svdbegin .................***Timeout 292.04 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 
3.565890e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.818390e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.197656e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.795985e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.950150e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 7.617643e-01 s Time to initialize coeftab 7.260941e-01 s Start 2338: mpi_dst_example_simple_lap_c_facto4_sched0_not_svdbegin 1998/3626 Test #2342: mpi_dst_example_simple_lap_c_facto4_sched0_kwayprojections_svdbegin .....***Timeout 288.95 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD 
Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.458424e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.014548e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.552393e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.154626e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.032439e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 9.053941e-01 s Time to initialize coeftab 6.098166e-01 s Start 2342: mpi_dst_example_simple_lap_c_facto4_sched0_kwayprojections_svdbegin Test #2004: mpi_dst_example_simple_lap_s_facto2_sched4_1d ...........................***Timeout 202.30 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The 
thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Start 2478: mpi_dst_example_simple_lap_z_facto3_sched0_not_rqrcpbegin Start 2479: mpi_dst_example_simple_lap_z_facto3_sched0_not_rqrcpend Start 2480: mpi_dst_example_simple_lap_z_facto3_sched0_kway_rqrcpbegin Start 2481: mpi_dst_example_simple_lap_z_facto3_sched0_kway_rqrcpend Start 2482: mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_rqrcpbegin Start 2483: mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_rqrcpend Start 2484: mpi_dst_example_simple_lap_z_facto3_sched0_not_tqrcpbegin Start 2485: mpi_dst_example_simple_lap_z_facto3_sched0_not_tqrcpend Start 2486: mpi_dst_example_simple_lap_z_facto3_sched0_kway_tqrcpbegin Start 2487: mpi_dst_example_simple_lap_z_facto3_sched0_kway_tqrcpend Start 2488: mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_tqrcpbegin Start 2489: mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_tqrcpend Start 2490: mpi_dst_example_simple_lap_z_facto3_sched0_not_rqrrtbegin Start 2491: mpi_dst_example_simple_lap_z_facto3_sched0_not_rqrrtend Start 2492: mpi_dst_example_simple_lap_z_facto3_sched0_kway_rqrrtbegin Start 2493: mpi_dst_example_simple_lap_z_facto3_sched0_kway_rqrrtend Start 2494: mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_rqrrtbegin Start 2495: mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_rqrrtend Start 2496: 
mpi_dst_example_simple_lap_z_facto3_sched0_kway_pqrcpilu0 Start 2497: mpi_dst_example_simple_lap_z_facto3_sched0_kway_pqrcpilu1 Start 2498: mpi_dst_example_simple_lap_z_facto4_sched0_not_svdbegin Start 2499: mpi_dst_example_simple_lap_z_facto4_sched0_not_svdend Start 2500: mpi_dst_example_simple_lap_z_facto4_sched0_kway_svdbegin Start 2501: mpi_dst_example_simple_lap_z_facto4_sched0_kway_svdend Start 2502: mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_svdbegin Start 2503: mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_svdend Start 2504: mpi_dst_example_simple_lap_z_facto4_sched0_not_pqrcpbegin Start 2505: mpi_dst_example_simple_lap_z_facto4_sched0_not_pqrcpend Start 2506: mpi_dst_example_simple_lap_z_facto4_sched0_kway_pqrcpbegin Start 2507: mpi_dst_example_simple_lap_z_facto4_sched0_kway_pqrcpend Start 2508: mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_pqrcpbegin Start 2509: mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_pqrcpend Start 2510: mpi_dst_example_simple_lap_z_facto4_sched0_not_rqrcpbegin 1999/3626 Test #2374: mpi_dst_example_simple_lap_z_facto0_sched0_kwayprojections_svdbegin ..... Passed 199.49 sec 2000/3626 Test #2386: mpi_dst_example_simple_lap_z_facto0_sched0_kwayprojections_rqrcpbegin ... Passed 197.82 sec 2001/3626 Test #2360: mpi_dst_example_simple_lap_c_facto4_sched0_kwayprojections_tqrcpbegin ... Passed 200.12 sec 2002/3626 Test #2370: mpi_dst_example_simple_lap_z_facto0_sched0_not_svdbegin ................. Passed 199.65 sec 2003/3626 Test #2372: mpi_dst_example_simple_lap_z_facto0_sched0_kway_svdbegin ................ 
Passed 199.55 sec Start 2511: mpi_dst_example_simple_lap_z_facto4_sched0_not_rqrcpend Start 2512: mpi_dst_example_simple_lap_z_facto4_sched0_kway_rqrcpbegin Start 2513: mpi_dst_example_simple_lap_z_facto4_sched0_kway_rqrcpend Start 2514: mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_rqrcpbegin Start 2515: mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_rqrcpend Test #1880: c_mpi_rep_example_simple_scotch_mm2 ..................................... Passed 80.50 sec Start 2516: mpi_dst_example_simple_lap_z_facto4_sched0_not_tqrcpbegin Test #1878: c_mpi_rep_example_simple_scotch_mm ...................................... Passed 80.64 sec Start 2517: mpi_dst_example_simple_lap_z_facto4_sched0_not_tqrcpend Test #1998: mpi_dst_example_simple_lap_z_facto1_sched1_1d ........................... Passed 83.19 sec Start 2518: mpi_dst_example_simple_lap_z_facto4_sched0_kway_tqrcpbegin Test #1941: mpi_rep_example_simple_lap_d_facto0_sched1_1d ........................... Passed 83.60 sec Start 2519: mpi_dst_example_simple_lap_z_facto4_sched0_kway_tqrcpend Test #2026: mpi_dst_example_simple_lap_s_facto0_sched0_kway_pqrcpbegin .............. Passed 85.04 sec Start 2520: mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_tqrcpbegin Test #1923: mpi_rep_example_simple_lap_s_facto1_sched0_1d ........................... Passed 86.90 sec Start 2521: mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_tqrcpend Test #1894: c_mpi_rep_example_refinement_lap_s_refine_bicgstab_sym .................. Passed 90.97 sec Start 2522: mpi_dst_example_simple_lap_z_facto4_sched0_not_rqrrtbegin Test #2065: mpi_dst_example_simple_lap_s_facto1_sched0_kway_rqrcpend ................ Passed 95.03 sec Start 2523: mpi_dst_example_simple_lap_z_facto4_sched0_not_rqrrtend Test #1917: c_mpi_rep_example_simple_mixed_lap_z_refine_gmres_her ................... 
Passed 96.17 sec Start 2524: mpi_dst_example_simple_lap_z_facto4_sched0_kway_rqrrtbegin Test #1946: mpi_rep_example_simple_lap_c_facto2_sched1_1d ........................... Passed 96.19 sec Start 2525: mpi_dst_example_simple_lap_z_facto4_sched0_kway_rqrrtend Test #1945: mpi_rep_example_simple_lap_c_facto1_sched1_1d ........................... Passed 97.29 sec Start 2526: mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_rqrrtbegin Test #1892: c_mpi_rep_example_refinement_lap_s_refine_cg_sym ........................ Passed 98.21 sec Start 2527: mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_rqrrtend Test #1953: mpi_rep_example_simple_lap_z_facto4_sched1_1d ........................... Passed 98.22 sec Start 2528: mpi_dst_example_simple_lap_z_facto4_sched0_kway_pqrcpilu0 Test #1954: mpi_rep_example_simple_lap_s_facto0_sched4_1d ........................... Passed 105.01 sec Start 2529: mpi_dst_example_simple_lap_z_facto4_sched0_kway_pqrcpilu1 Test #1861: c_mpi_rep_example_personal_lap_s_facto0 ................................. Passed 105.90 sec Start 2530: mpi_dst_example_simple_lap_s_facto0_sched1_not_svdbegin Test #1914: c_mpi_rep_example_simple_mixed_lap_d_refine_gmres_sym ................... Passed 108.04 sec Start 2531: mpi_dst_example_simple_lap_s_facto0_sched1_not_svdend Test #1904: c_mpi_rep_example_refinement_lap_z_refine_cg_her ........................ Passed 108.97 sec Start 2532: mpi_dst_example_simple_lap_s_facto0_sched1_kway_svdbegin Test #1887: c_mpi_rep_example_step-by-step_single_hb ................................ Passed 110.49 sec Start 2533: mpi_dst_example_simple_lap_s_facto0_sched1_kway_svdend Test #1932: mpi_rep_example_simple_lap_c_facto4_sched0_1d ........................... Passed 112.11 sec Start 2534: mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_svdbegin Test #1951: mpi_rep_example_simple_lap_z_facto2_sched1_1d ........................... 
Passed 112.58 sec Start 2535: mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_svdend Test #1864: c_mpi_rep_example_personal_lap_d_facto0 ................................. Passed 113.95 sec Start 2536: mpi_dst_example_simple_lap_s_facto0_sched1_not_pqrcpbegin Test #1988: mpi_dst_example_simple_lap_s_facto2_sched1_1d ........................... Passed 115.09 sec Start 2537: mpi_dst_example_simple_lap_s_facto0_sched1_not_pqrcpend Test #1849: c_mpi_rep_example_step-by-step_lap_d_facto1 ............................. Passed 118.35 sec Start 2538: mpi_dst_example_simple_lap_s_facto0_sched1_kway_pqrcpbegin Test #2275: mpi_dst_example_simple_lap_c_facto2_sched0_not_svdend ................... Passed 116.44 sec Start 2539: mpi_dst_example_simple_lap_s_facto0_sched1_kway_pqrcpend Test #2175: mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_rqrrtend ..... Passed 122.27 sec Start 2540: mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_pqrcpbegin Test #1866: c_mpi_rep_example_personal_lap_d_facto2 ................................. Passed 129.36 sec Start 2541: mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_pqrcpend Test #1847: c_mpi_rep_example_step-by-step_lap_s_facto2 ............................. Passed 130.12 sec Start 2542: mpi_dst_example_simple_lap_s_facto0_sched1_not_rqrcpbegin Test #2268: mpi_dst_example_simple_lap_c_facto1_sched0_kway_rqrrtbegin .............. Passed 129.41 sec Start 2543: mpi_dst_example_simple_lap_s_facto0_sched1_not_rqrcpend Test #1856: c_mpi_rep_example_step-by-step_lap_z_facto0 ............................. Passed 135.34 sec Start 2544: mpi_dst_example_simple_lap_s_facto0_sched1_kway_rqrcpbegin Test #2162: mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_rqrcpbegin ... Passed 138.64 sec Start 2545: mpi_dst_example_simple_lap_s_facto0_sched1_kway_rqrcpend Test #1918: c_mpi_rep_example_simple_mixed_lap_z_refine_bicgstab_her ................ 
Passed 145.67 sec Start 2546: mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_rqrcpbegin Test #2345: mpi_dst_example_simple_lap_c_facto4_sched0_not_pqrcpend ................. Passed 146.19 sec Start 2547: mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_rqrcpend Test #1872: c_mpi_rep_example_personal_lap_z_facto0 ................................. Passed 165.19 sec Start 2548: mpi_dst_example_simple_lap_s_facto0_sched1_not_tqrcpbegin Test #1881: c_mpi_rep_example_simple_single_rsa ..................................... Passed 165.29 sec Start 2549: mpi_dst_example_simple_lap_s_facto0_sched1_not_tqrcpend Test #1876: c_mpi_rep_example_personal_lap_z_facto4 ................................. Passed 166.33 sec Start 2550: mpi_dst_example_simple_lap_s_facto0_sched1_kway_tqrcpbegin Test #2149: mpi_dst_example_simple_lap_d_facto1_sched0_kway_svdend .................. Passed 165.60 sec Start 2551: mpi_dst_example_simple_lap_s_facto0_sched1_kway_tqrcpend Test #1913: c_mpi_rep_example_simple_mixed_lap_d_refine_cg_sym ...................... Passed 166.87 sec Test #1929: mpi_rep_example_simple_lap_c_facto1_sched0_1d ........................... Passed 166.81 sec Start 2552: mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_tqrcpbegin Start 2553: mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_tqrcpend Test #1845: c_mpi_rep_example_step-by-step_lap_s_facto0 ............................. Passed 169.69 sec Start 2554: mpi_dst_example_simple_lap_s_facto0_sched1_not_rqrrtbegin Test #1874: c_mpi_rep_example_personal_lap_z_facto2 ................................. Passed 169.92 sec Start 2555: mpi_dst_example_simple_lap_s_facto0_sched1_not_rqrrtend Test #1846: c_mpi_rep_example_step-by-step_lap_s_facto1 ............................. Passed 174.19 sec Start 2556: mpi_dst_example_simple_lap_s_facto0_sched1_kway_rqrrtbegin Test #1960: mpi_rep_example_simple_lap_c_facto0_sched4_1d ........................... 
Passed 173.99 sec Start 2557: mpi_dst_example_simple_lap_s_facto0_sched1_kway_rqrrtend Test #1852: c_mpi_rep_example_step-by-step_lap_c_facto1 ............................. Passed 177.01 sec Start 2558: mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_rqrrtbegin Test #1854: c_mpi_rep_example_step-by-step_lap_c_facto3 ............................. Passed 177.33 sec Start 2559: mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_rqrrtend Test #1955: mpi_rep_example_simple_lap_s_facto1_sched4_1d ........................... Passed 177.48 sec Start 2560: mpi_dst_example_simple_lap_s_facto0_sched1_kway_pqrcpilu0 Test #1857: c_mpi_rep_example_step-by-step_lap_z_facto1 ............................. Passed 178.39 sec Start 2561: mpi_dst_example_simple_lap_s_facto0_sched1_kway_pqrcpilu1 Test #1859: c_mpi_rep_example_step-by-step_lap_z_facto3 ............................. Passed 181.58 sec Start 2562: mpi_dst_example_simple_lap_s_facto1_sched1_not_svdbegin Test #1869: c_mpi_rep_example_personal_lap_c_facto2 ................................. Passed 182.91 sec Start 2563: mpi_dst_example_simple_lap_s_facto1_sched1_not_svdend Test #1860: c_mpi_rep_example_step-by-step_lap_z_facto4 ............................. Passed 184.10 sec Start 2564: mpi_dst_example_simple_lap_s_facto1_sched1_kway_svdbegin Test #1910: c_mpi_rep_example_simple_mixed_refine_cg ................................ Passed 184.50 sec Start 2565: mpi_dst_example_simple_lap_s_facto1_sched1_kway_svdend Test #2052: mpi_dst_example_simple_lap_s_facto1_sched0_kway_svdbegin ................ Passed 185.90 sec Start 2566: mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_svdbegin Test #1977: mpi_dst_example_simple_lap_c_facto1_sched0_1d ........................... Passed 189.91 sec Start 2567: mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_svdend Test #1999: mpi_dst_example_simple_lap_z_facto2_sched1_1d ........................... 
Passed 194.17 sec Start 2568: mpi_dst_example_simple_lap_s_facto1_sched1_not_pqrcpbegin Test #2075: mpi_dst_example_simple_lap_s_facto1_sched0_not_rqrrtend ................. Passed 193.85 sec Start 2569: mpi_dst_example_simple_lap_s_facto1_sched1_not_pqrcpend Test #2104: mpi_dst_example_simple_lap_s_facto2_sched0_kwayprojections_tqrcpbegin ... Passed 192.85 sec Start 2570: mpi_dst_example_simple_lap_s_facto1_sched1_kway_pqrcpbegin Test #1925: mpi_rep_example_simple_lap_d_facto0_sched0_1d ........................... Passed 195.20 sec Start 2571: mpi_dst_example_simple_lap_s_facto1_sched1_kway_pqrcpend Test #1868: c_mpi_rep_example_personal_lap_c_facto1 ................................. Passed 195.45 sec Start 2572: mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_pqrcpbegin Test #2156: mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_pqrcpbegin ... Passed 194.79 sec Start 2573: mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_pqrcpend Test #2314: mpi_dst_example_simple_lap_c_facto3_sched0_kway_pqrcpbegin .............. Passed 193.76 sec Start 2574: mpi_dst_example_simple_lap_s_facto1_sched1_not_rqrcpbegin 2063/3626 Test #2437: mpi_dst_example_simple_lap_z_facto2_sched0_kway_svdend .................. 
Passed 193.28 sec Start 2575: mpi_dst_example_simple_lap_s_facto1_sched1_not_rqrcpend Test #2172: mpi_dst_example_simple_lap_d_facto1_sched0_kway_rqrrtbegin ..............***Failed 196.56 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 [arch-nspawn-3655178:1606328] *** Process received signal *** [arch-nspawn-3655178:1606328] Signal: Segmentation fault (11) [arch-nspawn-3655178:1606328] Signal code: Address not mapped (1) [arch-nspawn-3655178:1606328] Failing at address: 0x7f94d4ca2860 [arch-nspawn-3655178:1606328] [ 0] linux-vdso.so.1(__vdso_rt_sigreturn+0x0) [0x7fb4792da6cc] [arch-nspawn-3655178:1606328] [ 1] /usr/lib/libopen-pal.so.80(mca_btl_sm_poll_handle_frag+0x18a) [0x7fb46b1a3a02] [arch-nspawn-3655178:1606328] [ 2] /usr/lib/libopen-pal.so.80(+0x74504) [0x7fb46b1a4504] [arch-nspawn-3655178:1606328] [ 3] /usr/lib/libopen-pal.so.80(opal_progress+0x30) [0x7fb46b155a7a] [arch-nspawn-3655178:1606328] [ 4] /usr/lib/libopen-pal.so.80(ompi_sync_wait_mt+0xda) [0x7fb46b182aa2] [arch-nspawn-3655178:1606328] [ 5] /usr/lib/libmpi.so.40(+0x7de1a) [0x7fb47827de1a] [arch-nspawn-3655178:1606328] [ 6] /usr/lib/libmpi.so.40(ompi_request_default_wait+0x1a) [0x7fb47828019c] [arch-nspawn-3655178:1606328] [ 7] /usr/lib/libmpi.so.40(ompi_coll_base_sendrecv_actual+0x98) [0x7fb4782f03e8] [arch-nspawn-3655178:1606328] [ 8] /usr/lib/libmpi.so.40(ompi_coll_base_allreduce_intra_recursivedoubling+0x210) [0x7fb4782f1a88] [arch-nspawn-3655178:1606328] [ 9] /usr/lib/libmpi.so.40(ompi_coll_base_allreduce_intra_ring+0x3fc) [0x7fb4782f443c] [arch-nspawn-3655178:1606328] [10] /usr/lib/libmpi.so.40(ompi_coll_tuned_allreduce_intra_dec_fixed+0x40) [0x7fb478315152] [arch-nspawn-3655178:1606328] [11] /usr/lib/libmpi.so.40(MPI_Allreduce+0x294) [0x7fb47828e584] [arch-nspawn-3655178:1606328] [12] /build/pastix/src/build/spm/src/libspm.so.1(spmUpdateComputedFields+0x140) [0x7fb478540458] [arch-nspawn-3655178:1606328] [13] 
/build/pastix/src/build/spm/src/libspm.so.1(genLaplacian+0xaa) [0x7fb47854921e] [arch-nspawn-3655178:1606328] [14] /build/pastix/src/build/spm/src/libspm.so.1(+0x409c8) [0x7fb47854a9c8] [arch-nspawn-3655178:1606328] [15] ./simple(+0xe2c) [0x555555556e2c] [arch-nspawn-3655178:1606328] [16] /usr/lib/libc.so.6(+0x27fae) [0x7fb4780a4fae] [arch-nspawn-3655178:1606328] [17] /usr/lib/libc.so.6(__libc_start_main+0x72) [0x7fb4780a50b8] [arch-nspawn-3655178:1606328] [18] ./simple(+0x1174) [0x555555557174] [arch-nspawn-3655178:1606328] *** End of error message *** -------------------------------------------------------------------------- prte noticed that process rank 2 with PID 1606328 on node arch-nspawn-3655178 exited on signal 11 (Segmentation fault). -------------------------------------------------------------------------- Start 2172: mpi_dst_example_simple_lap_d_facto1_sched0_kway_rqrrtbegin Test #2180: mpi_dst_example_simple_lap_d_facto2_sched0_kway_svdbegin ................ Passed 196.67 sec Start 2576: mpi_dst_example_simple_lap_s_facto1_sched1_kway_rqrcpbegin Test #1972: mpi_dst_example_simple_lap_s_facto2_sched0_1d ........................... 
Passed 199.04 sec Start 2577: mpi_dst_example_simple_lap_s_facto1_sched1_kway_rqrcpend Test #1885: c_mpi_rep_example_step-by-step_single_rsa ...............................***Timeout 235.50 sec RSA driver is no longer supported and is replaced by the HB driver RSA driver is no longer supported and is replaced by the HB driver RSA driver is no longer supported and is replaced by the HB driver RSA driver is no longer supported and is replaced by the HB driver ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 12111 nnz: 40537 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.998372e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 1607873 Fill-in of L 39.664331 Time to compute symbol matrix 4.317437e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.601501e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number 
of non-zeroes in blocked L 1607873 Fill-in 39.664331 Number of operations in full-rank: LDL^t 644.52 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.714741e-02 s Time for mapping/scheduling 9.922102e-01 s Time to initialize internal csc 6.642193e-03 s Time to initialize coeftab 6.379499e-02 s Time to factorize 3.803743e+01 s (16.94 MFlop/s) Number of operations 3.22 GFlops Number of static pivots 5000 Memory usage of coeftab 1.82 Mo Time to solve 1.284162e+01 s WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Time for refinement 2.418854e+01 s || A ||_1 2.170513e-01 max(|| b_i ||_oo) 1.032220e-01 max(|| x_i ||_oo) 7.434845e-01 || A ||_1 2.170513e-01 max(|| b_i ||_oo) 1.032220e-01 max(|| x_i ||_oo) 7.434845e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.082933e-13 max(|| b_i - A x_i ||_1) 2.083110e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.340988e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.082933e-13 max(|| b_i - A x_i ||_1) 2.083110e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.340988e-01 (SUCCESS) || A ||_1 2.170513e-01 max(|| b_i ||_oo) 1.032220e-01 max(|| x_i ||_oo) 7.434845e-01 || A ||_1 2.170513e-01 max(|| b_i ||_oo) 1.032220e-01 max(|| x_i ||_oo) 7.434845e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.082933e-13 max(|| b_i - A x_i ||_1) 2.083110e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.340988e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.082933e-13 max(|| b_i - A x_i ||_1) 2.083110e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.340988e-01 (SUCCESS) Time to solve 5.039413e+00 s WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only 
with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Time for refinement 6.447445e+00 s || A ||_1 2.170513e-01 max(|| b_i ||_oo) 1.032220e-01 max(|| x_i ||_oo) 7.434845e-01 || A ||_1 2.170513e-01 || A ||_1 2.170513e-01 max(|| b_i ||_oo) 1.032220e-01 max(|| x_i ||_oo) 7.434845e-01 max(|| b_i ||_oo) 1.032220e-01 max(|| x_i ||_oo) 7.434845e-01 || A ||_1 2.170513e-01 max(|| b_i ||_oo) 1.032220e-01 max(|| x_i ||_oo) 7.434845e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.082933e-13 max(|| b_i - A x_i ||_1) 2.083110e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.340988e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.082933e-13 max(|| b_i - A x_i ||_1) 2.083110e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.340988e-01 (SUCCESS) Time to initialize internal csc 4.955436e-03 s max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.082933e-13 max(|| b_i - A x_i ||_1) 2.083110e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.340988e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.082933e-13 max(|| b_i - A x_i ||_1) 2.083110e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.340988e-01 (SUCCESS) Time to initialize coeftab 1.254570e+00 s Test #1896: c_mpi_rep_example_refinement_lap_d_refine_gmres_sym .....................***Timeout 235.48 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled 
Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.271601e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.150052e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.442265e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.497896e+00 s Time to initialize internal csc 2.270349e-03 s - iteration 1 : total iteration time 0.422 s error 0.20451 - iteration 2 : total iteration time 0.278 s error 0.05944 - iteration 3 : total iteration time 0.419 s error 0.019007 - iteration 4 : total iteration time 0.696 s error 0.0066596 - iteration 5 : total iteration time 0.513 s error 0.0023054 - iteration 6 : total iteration time 0.77 s error 0.00077935 - iteration 7 : total iteration time 0.598 s error 0.00027759 - iteration 8 : total iteration time 1.15 s error 9.3504e-05 - iteration 9 : total iteration time 1.39 s error 3.0631e-05 - iteration 10 : total iteration time 1.44 s error 1.0017e-05 - iteration 11 : total iteration time 4.62 s error 3.0969e-06 - iteration 12 : total iteration time 2.06 s error 9.333e-07 - iteration 
13 : total iteration time 4.65 s error 2.7791e-07 - iteration 14 : total iteration time 8.78 s error 8.2065e-08 - iteration 15 : total iteration time 9.51 s error 2.3931e-08 - iteration 16 : total iteration time 11.2 s error 6.7596e-09 - iteration 17 : total iteration time 25.3 s error 1.8866e-09 - iteration 18 : total iteration time 10.3 s error 5.4042e-10 - iteration 19 : total iteration time 4.42 s error 1.7791e-10 - iteration 20 : total iteration time 4.35 s error 6.5041e-11 - iteration 21 : total iteration time 8.26 s error 2.4325e-11 - iteration 22 : total iteration time 10.4 s error 7.3601e-12 - iteration 23 : total iteration time 5.65 s error 2.1037e-12 Test #1905: c_mpi_rep_example_refinement_lap_z_refine_gmres_her .....................***Timeout 235.46 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Complex64 Format: CSC N: 1000 nnz: 11476 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.531864e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in 
L structure 84938 Fill-in of L 7.401359 Time to compute symbol matrix 6.930664e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.770317e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 169876 Fill-in 14.802719 Number of operations in full-rank: LU 62.91 MFlops Prediction: Model AMD 6180 MKL Time to factorize 2.138097e-03 s Time for mapping/scheduling 6.077305e-01 s Time to initialize internal csc 5.950585e-03 s - iteration 1 : total iteration time 0.495 s error 0.22315 - iteration 2 : total iteration time 0.232 s error 0.071346 - iteration 3 : total iteration time 0.462 s error 0.027924 - iteration 4 : total iteration time 0.492 s error 0.0091303 - iteration 5 : total iteration time 0.829 s error 0.0041632 - iteration 6 : total iteration time 0.758 s error 0.0019584 - iteration 7 : total iteration time 1.19 s error 0.0006969 - iteration 8 : total iteration time 1.02 s error 0.00021478 - iteration 9 : total iteration time 0.945 s error 9.1664e-05 - iteration 10 : total iteration time 4.78 s error 3.5384e-05 - iteration 11 : total iteration time 1.8 s error 1.5972e-05 - iteration 12 : total iteration time 10.2 s error 6.3906e-06 - iteration 13 : total iteration time 6.16 s error 1.9635e-06 - iteration 14 : total iteration time 6.86 s error 7.3819e-07 - iteration 15 : total iteration time 10.2 s error 2.9725e-07 - iteration 16 : total iteration time 26.3 s error 1.21e-07 - iteration 17 : total iteration time 8.8 s error 5.1897e-08 - iteration 18 : total iteration time 4.34 s error 1.9171e-08 - iteration 19 : total iteration time 4.32 s error 6.9474e-09 - iteration 20 : total iteration time 7.58 s error 2.7844e-09 - iteration 21 : total iteration time 10.7 s error 1.2383e-09 - iteration 22 : total iteration time 6.29 s error 5.9112e-10 Test #1906: c_mpi_rep_example_refinement_lap_z_refine_bicgstab_her 
..................***Timeout 235.46 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Test #1908: c_mpi_rep_example_refinement_lap_z_refine_gmres_sym .....................***Timeout 235.46 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Test #1909: c_mpi_rep_example_refinement_lap_z_refine_bicgstab_sym ..................***Timeout 235.45 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX 
package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.650494e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.299428e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.378435e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.777181e-01 s Time to initialize internal csc 5.355111e-03 s Test #1916: c_mpi_rep_example_simple_mixed_lap_z_refine_cg_her ......................***Timeout 235.43 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #1926: mpi_rep_example_simple_lap_d_facto1_sched0_1d ...........................***Timeout 235.39 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #1928: 
mpi_rep_example_simple_lap_c_facto0_sched0_1d ...........................***Timeout 235.39 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #1940: mpi_rep_example_simple_lap_s_facto2_sched1_1d ...........................***Timeout 235.36 sec Test #1958: mpi_rep_example_simple_lap_d_facto1_sched4_1d ...........................***Timeout 235.30 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #1962: mpi_rep_example_simple_lap_c_facto2_sched4_1d ...........................***Timeout 235.29 sec ischedInit: The thread number has been automatically set to 256 Test #1963: mpi_rep_example_simple_lap_c_facto3_sched4_1d ...........................***Timeout 235.28 sec Test #1966: mpi_rep_example_simple_lap_z_facto1_sched4_1d ...........................***Timeout 235.27 sec ischedInit: The thread number has been automatically set to 256 Test #1967: mpi_rep_example_simple_lap_z_facto2_sched4_1d ...........................***Timeout 235.27 sec Test #1968: mpi_rep_example_simple_lap_z_facto3_sched4_1d ...........................***Timeout 235.26 sec Test #1969: mpi_rep_example_simple_lap_z_facto4_sched4_1d ...........................***Timeout 235.25 sec Test #1970: mpi_dst_example_simple_lap_s_facto0_sched0_1d ...........................***Timeout 235.25 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 
1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Test #1973: mpi_dst_example_simple_lap_d_facto0_sched0_1d ...........................***Timeout 235.23 sec Test #1978: mpi_dst_example_simple_lap_c_facto2_sched0_1d ...........................***Timeout 235.22 sec Test #1984: mpi_dst_example_simple_lap_z_facto3_sched0_1d ...........................***Timeout 235.22 sec Test #1985: mpi_dst_example_simple_lap_z_facto4_sched0_1d ...........................***Timeout 235.21 sec ischedInit: The thread number has been automatically set to 256 Test #1987: mpi_dst_example_simple_lap_s_facto1_sched1_1d ...........................***Timeout 235.21 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Test #1989: mpi_dst_example_simple_lap_d_facto0_sched1_1d ...........................***Timeout 235.19 sec Test #1990: mpi_dst_example_simple_lap_d_facto1_sched1_1d ...........................***Timeout 235.19 sec ischedInit: The thread number has been automatically set to 256 Test #1994: mpi_dst_example_simple_lap_c_facto2_sched1_1d 
...........................***Timeout 235.18 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Test #1995: mpi_dst_example_simple_lap_c_facto3_sched1_1d ...........................***Timeout 235.17 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Test #1996: mpi_dst_example_simple_lap_c_facto4_sched1_1d ...........................***Timeout 235.17 sec ischedInit: The thread number has been automatically set to 256 Test #1997: mpi_dst_example_simple_lap_z_facto0_sched1_1d ...........................***Timeout 235.16 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2001: 
mpi_dst_example_simple_lap_z_facto4_sched1_1d ...........................***Timeout 235.14 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 2096/3626 Test #2426: mpi_dst_example_simple_lap_z_facto1_sched0_not_rqrrtbegin ............... Passed 232.68 sec Test #2002: mpi_dst_example_simple_lap_s_facto0_sched4_1d ...........................***Timeout 235.14 sec Test #2027: mpi_dst_example_simple_lap_s_facto0_sched0_kway_pqrcpend ................***Timeout 243.29 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 
5.229476e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.830767e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.102110e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.209162e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.036026e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.311446e-01 s Time to initialize coeftab 1.081514e-01 s Time to factorize 1.903013e+00 s ( 2.66 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 6.172720e-01 s Time for refinement 4.760458e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.098174e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.098174e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.098174e-07 max(|| b_i - A x_i ||_1) 9.596137e-08 max(|| b_i - A 
x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.205841e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.098174e-07 max(|| b_i - A x_i ||_1) 9.596137e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.205841e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 9.596137e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.205841e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 9.596137e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.205841e+00 (SUCCESS) Test #2173: mpi_dst_example_simple_lap_d_facto1_sched0_kway_rqrrtend ................***Timeout 328.20 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to 
compute ordering 6.112313e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.493399e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.773342e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.644253e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.002753e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 6.162423e-01 s Time to initialize coeftab 8.706321e-02 s Time to factorize 3.208523e+00 s ( 1.63 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 5.990054e-01 s - iteration 1 : total iteration time 0.969 s error 2.8928e-15 Time for refinement 2.220096e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.897596e-15 max(|| b_i - A x_i ||_1) 3.012481e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * 
||x_i||_oo * eps)) 3.785442e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.897596e-15 max(|| b_i - A x_i ||_1) 3.012481e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.785442e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.897596e-15 max(|| b_i - A x_i ||_1) 3.012481e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.785442e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.897596e-15 max(|| b_i - A x_i ||_1) 3.012481e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.785442e-03 (SUCCESS) Start 2173: mpi_dst_example_simple_lap_d_facto1_sched0_kway_rqrrtend Test #2193: mpi_dst_example_simple_lap_d_facto2_sched0_kway_rqrcpend ................***Timeout 357.60 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 
300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.712787e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.604711e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.808575e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.412364e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.199599e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 7.613408e-01 s Time to initialize coeftab 1.770775e-01 s Time to factorize 5.379775e+00 s ( 1.86 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 9.370424e-01 s - iteration 1 : total iteration time 1.55 s error 2.8321e-16 Time for refinement 2.553864e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 
4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.028938e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.028938e-16 max(|| b_i - A x_i ||_1) 7.593883e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.542367e-04 (SUCCESS) max(|| b_i - A x_i ||_1) 7.593883e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.542367e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.028938e-16 max(|| b_i - A x_i ||_1) 7.593883e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.542367e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.028938e-16 max(|| b_i - A x_i ||_1) 7.593883e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.542367e-04 (SUCCESS) Start 2193: mpi_dst_example_simple_lap_d_facto2_sched0_kway_rqrcpend Test #2219: mpi_dst_example_simple_lap_c_facto0_sched0_kway_pqrcpend ................***Timeout 381.00 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: 
The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.892974e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.001098e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.627113e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 7.038856e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.611080e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.890460e-01 s Time to initialize coeftab 5.450331e-01 s Time to factorize 2.179558e+00 s ( 9.31 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 6.102967e-01 s Time for refinement 5.187517e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i 
||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.584095e-07 max(|| b_i - A x_i ||_1) 9.330522e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.354415e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.584095e-07 max(|| b_i - A x_i ||_1) 9.330522e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.354415e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.584095e-07 max(|| b_i - A x_i ||_1) 9.330522e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.354415e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.584095e-07 max(|| b_i - A x_i ||_1) 9.330522e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.354415e+00 (SUCCESS) Start 2219: mpi_dst_example_simple_lap_c_facto0_sched0_kway_pqrcpend Test #2223: mpi_dst_example_simple_lap_c_facto0_sched0_not_rqrcpend .................***Timeout 384.02 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 
Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.454192e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.809981e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.116815e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.137600e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.961982e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.080232e-01 s Time to initialize coeftab 1.397379e-01 s Time to factorize 7.598634e+00 s ( 2.67 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 5.400108e-01 s Time for refinement 4.491331e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 
max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.589400e-07 max(|| b_i - A x_i ||_1) 9.341289e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.357132e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.589400e-07 max(|| b_i - A x_i ||_1) 9.341289e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.357132e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.589400e-07 max(|| b_i - A x_i ||_1) 9.341289e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.357132e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.589400e-07 max(|| b_i - A x_i ||_1) 9.341289e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.357132e+00 (SUCCESS) Start 2223: mpi_dst_example_simple_lap_c_facto0_sched0_not_rqrcpend Test #2296: mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_tqrcpbegin ...***Timeout 410.92 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block 
Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.755974e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.319401e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.960571e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 4.571025e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.767286e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.859244e+00 s Time to initialize coeftab 1.187997e+00 s Time to factorize 2.364431e+01 s ( 1.69 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 5.672381e-01 s - iteration 1 : total iteration time 0.683 s error 7.3917e-11 Time for 
refinement 1.289234e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.577614e-08 max(|| b_i - A x_i ||_1) 3.281999e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.281624e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.577614e-08 max(|| b_i - A x_i ||_1) 3.281999e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.281624e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.577614e-08 max(|| b_i - A x_i ||_1) 3.281999e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.281624e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.577614e-08 max(|| b_i - A x_i ||_1) 3.281999e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.281624e-01 (SUCCESS) Start 2296: mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_tqrcpbegin 2099/3626 Test #2499: mpi_dst_example_simple_lap_z_facto4_sched0_not_svdend ...................***Timeout 412.26 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 
Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.309029e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.291160e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.493946e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.493398e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.025389e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 2.246161e-01 s Time to initialize coeftab 7.541165e-01 s Time to factorize 2.042917e+00 s (10.43 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 
177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 1.020934e-01 s - iteration 1 : total iteration time 0.287 s error 3.1668e-16 Time for refinement 5.491216e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.429057e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.429057e-16 max(|| b_i - A x_i ||_1) 8.353941e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.107983e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 8.353941e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.107983e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.429057e-16 max(|| b_i - A x_i ||_1) 8.353941e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.107983e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.429057e-16 max(|| b_i - A x_i ||_1) 8.353941e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.107983e-03 (SUCCESS) Start 2499: mpi_dst_example_simple_lap_z_facto4_sched0_not_svdend Start 2578: mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_rqrcpbegin Start 2579: mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_rqrcpend Start 2580: mpi_dst_example_simple_lap_s_facto1_sched1_not_tqrcpbegin Start 2581: mpi_dst_example_simple_lap_s_facto1_sched1_not_tqrcpend Start 2582: mpi_dst_example_simple_lap_s_facto1_sched1_kway_tqrcpbegin Start 2583: mpi_dst_example_simple_lap_s_facto1_sched1_kway_tqrcpend Start 2584: mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_tqrcpbegin Start 2585: mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_tqrcpend Start 2586: mpi_dst_example_simple_lap_s_facto1_sched1_not_rqrrtbegin Start 2587: 
mpi_dst_example_simple_lap_s_facto1_sched1_not_rqrrtend Start 2588: mpi_dst_example_simple_lap_s_facto1_sched1_kway_rqrrtbegin Start 2589: mpi_dst_example_simple_lap_s_facto1_sched1_kway_rqrrtend Start 2590: mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_rqrrtbegin Start 2591: mpi_dst_example_simple_lap_s_facto1_sched1_kway_pqrcpilu0 Start 2592: mpi_dst_example_simple_lap_s_facto1_sched1_kway_pqrcpilu1 Start 2593: mpi_dst_example_simple_lap_s_facto2_sched1_not_svdbegin Start 2594: mpi_dst_example_simple_lap_s_facto2_sched1_not_svdend Start 2595: mpi_dst_example_simple_lap_s_facto2_sched1_kway_svdbegin Start 2596: mpi_dst_example_simple_lap_s_facto2_sched1_kway_svdend Start 2597: mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_svdbegin Start 2598: mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_svdend Start 2599: mpi_dst_example_simple_lap_s_facto2_sched1_not_pqrcpbegin Start 2600: mpi_dst_example_simple_lap_s_facto2_sched1_not_pqrcpend Start 2601: mpi_dst_example_simple_lap_s_facto2_sched1_kway_pqrcpbegin Start 2602: mpi_dst_example_simple_lap_s_facto2_sched1_kway_pqrcpend Start 2603: mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_pqrcpbegin Start 2604: mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_pqrcpend Start 2605: mpi_dst_example_simple_lap_s_facto2_sched1_not_rqrcpbegin Start 2606: mpi_dst_example_simple_lap_s_facto2_sched1_not_rqrcpend Start 2607: mpi_dst_example_simple_lap_s_facto2_sched1_kway_rqrcpbegin Start 2608: mpi_dst_example_simple_lap_s_facto2_sched1_kway_rqrcpend Start 2609: mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_rqrcpbegin Start 2610: mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_rqrcpend Test #2005: mpi_dst_example_simple_lap_d_facto0_sched4_1d ...........................***Timeout 415.68 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Test #2007: mpi_dst_example_simple_lap_d_facto2_sched4_1d ...........................***Timeout 415.67 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Test #2008: mpi_dst_example_simple_lap_c_facto0_sched4_1d ...........................***Timeout 415.67 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Test #2009: mpi_dst_example_simple_lap_c_facto1_sched4_1d ...........................***Timeout 415.66 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Test #2010: mpi_dst_example_simple_lap_c_facto2_sched4_1d ...........................***Timeout 415.66 sec ischedInit: The thread number has been automatically set to 256 Test #2014: mpi_dst_example_simple_lap_z_facto1_sched4_1d ...........................***Timeout 415.65 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started 
PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 1D( 0) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 Test #2019: mpi_dst_example_simple_lap_s_facto0_sched0_not_svdend ...................***Timeout 415.65 sec ischedInit: The thread number has been automatically set to 256 Test #2020: mpi_dst_example_simple_lap_s_facto0_sched0_kway_svdbegin ................***Timeout 415.64 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2022: mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_svdbegin .....***Timeout 415.64 sec ischedInit: The thread number has been automatically set to 256 Test #2025: mpi_dst_example_simple_lap_s_facto0_sched0_not_pqrcpend 
.................***Timeout 415.63 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2028: mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_pqrcpbegin ...***Timeout 415.61 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal 
width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2030: mpi_dst_example_simple_lap_s_facto0_sched0_not_rqrcpbegin ...............***Timeout 415.61 sec ischedInit: The thread number has been automatically set to 256 Test #2032: mpi_dst_example_simple_lap_s_facto0_sched0_kway_rqrcpbegin ..............***Timeout 415.60 sec Test #2033: mpi_dst_example_simple_lap_s_facto0_sched0_kway_rqrcpend ................***Timeout 415.59 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 
Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2034: mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_rqrcpbegin ...***Timeout 415.59 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2035: mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_rqrcpend .....***Timeout 415.58 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2036: mpi_dst_example_simple_lap_s_facto0_sched0_not_tqrcpbegin ...............***Timeout 415.58 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2038: mpi_dst_example_simple_lap_s_facto0_sched0_kway_tqrcpbegin ..............***Timeout 415.57 sec Test #2039: mpi_dst_example_simple_lap_s_facto0_sched0_kway_tqrcpend ................***Timeout 415.56 sec Test #2040: mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_tqrcpbegin ...***Timeout 415.56 sec Test #2041: mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_tqrcpend .....***Timeout 415.55 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) 
Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2043: mpi_dst_example_simple_lap_s_facto0_sched0_not_rqrrtend .................***Timeout 415.54 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2046: mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_rqrrtbegin ...***Timeout 415.53 sec Test #2047: mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_rqrrtend .....***Timeout 415.53 sec ischedInit: The thread number has been 
automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Test #2048: mpi_dst_example_simple_lap_s_facto0_sched0_kway_pqrcpilu0 ...............***Timeout 415.52 sec ischedInit: The thread number has been automatically set to 256 Test #2049: mpi_dst_example_simple_lap_s_facto0_sched0_kway_pqrcpilu1 ...............***Timeout 415.51 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2050: mpi_dst_example_simple_lap_s_facto1_sched0_not_svdbegin .................***Timeout 415.51 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled 
StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2051: mpi_dst_example_simple_lap_s_facto1_sched0_not_svdend ...................***Timeout 415.50 sec ischedInit: The thread number has been automatically set to 256 Test #2053: mpi_dst_example_simple_lap_s_facto1_sched0_kway_svdend ..................***Timeout 415.49 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2054: mpi_dst_example_simple_lap_s_facto1_sched0_kwayprojections_svdbegin .....***Timeout 415.48 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per 
block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Test #2056: mpi_dst_example_simple_lap_s_facto1_sched0_not_pqrcpbegin ...............***Timeout 415.48 sec Test #2057: mpi_dst_example_simple_lap_s_facto1_sched0_not_pqrcpend .................***Timeout 415.47 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2058: mpi_dst_example_simple_lap_s_facto1_sched0_kway_pqrcpbegin ..............***Timeout 415.47 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread 
dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2060: mpi_dst_example_simple_lap_s_facto1_sched0_kwayprojections_pqrcpbegin ...***Timeout 415.46 sec Test #2063: mpi_dst_example_simple_lap_s_facto1_sched0_not_rqrcpend .................***Timeout 415.45 sec ischedInit: The thread number has been automatically set to 256 Test #2064: mpi_dst_example_simple_lap_s_facto1_sched0_kway_rqrcpbegin ..............***Timeout 415.45 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 
1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2066: mpi_dst_example_simple_lap_s_facto1_sched0_kwayprojections_rqrcpbegin ...***Timeout 415.43 sec Test #2068: mpi_dst_example_simple_lap_s_facto1_sched0_not_tqrcpbegin ...............***Timeout 415.43 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2069: mpi_dst_example_simple_lap_s_facto1_sched0_not_tqrcpend .................***Timeout 415.40 sec Test #2071: mpi_dst_example_simple_lap_s_facto1_sched0_kway_tqrcpend ................***Timeout 415.40 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting 
Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Test #2072: mpi_dst_example_simple_lap_s_facto1_sched0_kwayprojections_tqrcpbegin ...***Timeout 415.39 sec ischedInit: The thread number has been automatically set to 256 Test #2082: mpi_dst_example_simple_lap_s_facto2_sched0_not_svdbegin .................***Timeout 415.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Test #2087: mpi_dst_example_simple_lap_s_facto2_sched0_kwayprojections_svdend .......***Timeout 415.37 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 
Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Test #2088: mpi_dst_example_simple_lap_s_facto2_sched0_not_pqrcpbegin ...............***Timeout 415.37 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2089: mpi_dst_example_simple_lap_s_facto2_sched0_not_pqrcpend .................***Timeout 415.36 sec Test #2091: mpi_dst_example_simple_lap_s_facto2_sched0_kway_pqrcpend ................***Timeout 415.35 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2093: mpi_dst_example_simple_lap_s_facto2_sched0_kwayprojections_pqrcpend .....***Timeout 415.34 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2095: mpi_dst_example_simple_lap_s_facto2_sched0_not_rqrcpend .................***Timeout 415.34 sec Test #2135: mpi_dst_example_simple_lap_d_facto0_sched0_kway_tqrcpend ................***Timeout 415.33 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 2135: mpi_dst_example_simple_lap_d_facto0_sched0_kway_tqrcpend Test #2136: mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_tqrcpbegin ...***Timeout 415.33 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2136: mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_tqrcpbegin Test #2137: mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_tqrcpend .....***Timeout 415.33 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking 
size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2137: mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_tqrcpend Test #2138: mpi_dst_example_simple_lap_d_facto0_sched0_not_rqrrtbegin ...............***Timeout 415.33 sec ischedInit: The thread number has been automatically set to 256 Start 2138: mpi_dst_example_simple_lap_d_facto0_sched0_not_rqrrtbegin Test #2139: mpi_dst_example_simple_lap_d_facto0_sched0_not_rqrrtend .................***Timeout 415.33 sec Start 2139: mpi_dst_example_simple_lap_d_facto0_sched0_not_rqrrtend Test #2140: mpi_dst_example_simple_lap_d_facto0_sched0_kway_rqrrtbegin ..............***Timeout 415.33 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 
Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2140: mpi_dst_example_simple_lap_d_facto0_sched0_kway_rqrrtbegin Test #2141: mpi_dst_example_simple_lap_d_facto0_sched0_kway_rqrrtend ................***Timeout 415.33 sec ischedInit: The thread number has been automatically set to 256 Start 2141: mpi_dst_example_simple_lap_d_facto0_sched0_kway_rqrrtend Test #2142: mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_rqrrtbegin ...***Timeout 415.33 sec Start 2142: mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_rqrrtbegin Test #2143: mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_rqrrtend .....***Timeout 415.32 sec ischedInit: The thread number has been automatically set to 256 Start 2143: mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_rqrrtend Test #2144: mpi_dst_example_simple_lap_d_facto0_sched0_kway_pqrcpilu0 ...............***Timeout 415.32 sec ischedInit: The thread number has been automatically set to 256 Start 2144: mpi_dst_example_simple_lap_d_facto0_sched0_kway_pqrcpilu0 Test #2145: mpi_dst_example_simple_lap_d_facto0_sched0_kway_pqrcpilu1 ...............***Timeout 415.32 sec Start 2145: mpi_dst_example_simple_lap_d_facto0_sched0_kway_pqrcpilu1 Test #2146: mpi_dst_example_simple_lap_d_facto1_sched0_not_svdbegin .................***Timeout 415.31 sec ischedInit: The thread number has been automatically set to 256 Start 2146: mpi_dst_example_simple_lap_d_facto1_sched0_not_svdbegin Test #2147: mpi_dst_example_simple_lap_d_facto1_sched0_not_svdend ...................***Timeout 415.31 sec ischedInit: The thread number has been automatically set to 256 Start 2147: mpi_dst_example_simple_lap_d_facto1_sched0_not_svdend Test #2148: mpi_dst_example_simple_lap_d_facto1_sched0_kway_svdbegin ................***Timeout 415.31 sec ischedInit: 
The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2148: mpi_dst_example_simple_lap_d_facto1_sched0_kway_svdbegin Test #2150: mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_svdbegin .....***Timeout 415.30 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 
1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2150: mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_svdbegin Test #2151: mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_svdend .......***Timeout 415.30 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2151: mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_svdend Test #2152: mpi_dst_example_simple_lap_d_facto1_sched0_not_pqrcpbegin ...............***Timeout 415.30 sec Start 2152: mpi_dst_example_simple_lap_d_facto1_sched0_not_pqrcpbegin Test #2153: mpi_dst_example_simple_lap_d_facto1_sched0_not_pqrcpend .................***Timeout 415.29 sec Start 2153: mpi_dst_example_simple_lap_d_facto1_sched0_not_pqrcpend Test #2155: mpi_dst_example_simple_lap_d_facto1_sched0_kway_pqrcpend 
................***Timeout 415.29 sec ischedInit: The thread number has been automatically set to 256 Start 2155: mpi_dst_example_simple_lap_d_facto1_sched0_kway_pqrcpend Test #2157: mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_pqrcpend .....***Timeout 415.28 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2157: mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_pqrcpend Test #2158: mpi_dst_example_simple_lap_d_facto1_sched0_not_rqrcpbegin ...............***Timeout 415.28 sec Start 2158: mpi_dst_example_simple_lap_d_facto1_sched0_not_rqrcpbegin Test #2159: mpi_dst_example_simple_lap_d_facto1_sched0_not_rqrcpend .................***Timeout 415.28 sec Start 2159: mpi_dst_example_simple_lap_d_facto1_sched0_not_rqrcpend Test #2160: mpi_dst_example_simple_lap_d_facto1_sched0_kway_rqrcpbegin ..............***Timeout 415.28 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2160: mpi_dst_example_simple_lap_d_facto1_sched0_kway_rqrcpbegin Test #2161: mpi_dst_example_simple_lap_d_facto1_sched0_kway_rqrcpend ................***Timeout 415.28 sec Start 2161: mpi_dst_example_simple_lap_d_facto1_sched0_kway_rqrcpend Test #2163: mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_rqrcpend .....***Timeout 415.27 sec Start 2163: mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_rqrcpend Test #2164: mpi_dst_example_simple_lap_d_facto1_sched0_not_tqrcpbegin ...............***Timeout 415.27 sec ischedInit: The thread number has been automatically set to 256 Start 2164: mpi_dst_example_simple_lap_d_facto1_sched0_not_tqrcpbegin Test #2165: mpi_dst_example_simple_lap_d_facto1_sched0_not_tqrcpend .................***Timeout 415.27 sec ischedInit: The thread number has been automatically set to 256 Start 2165: mpi_dst_example_simple_lap_d_facto1_sched0_not_tqrcpend Test #2166: 
mpi_dst_example_simple_lap_d_facto1_sched0_kway_tqrcpbegin ..............***Timeout 415.27 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 2166: mpi_dst_example_simple_lap_d_facto1_sched0_kway_tqrcpbegin Test #2167: mpi_dst_example_simple_lap_d_facto1_sched0_kway_tqrcpend ................***Timeout 415.27 sec Start 2167: mpi_dst_example_simple_lap_d_facto1_sched0_kway_tqrcpend Test #2168: mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_tqrcpbegin ...***Timeout 415.27 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 
2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2168: mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_tqrcpbegin Test #2169: mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_tqrcpend .....***Timeout 415.26 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2169: mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_tqrcpend Test #2171: mpi_dst_example_simple_lap_d_facto1_sched0_not_rqrrtend .................***Timeout 415.26 sec Start 2171: mpi_dst_example_simple_lap_d_facto1_sched0_not_rqrrtend Test #2174: 
mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_rqrrtbegin ...***Timeout 415.25 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2174: mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_rqrrtbegin Test #2006: mpi_dst_example_simple_lap_d_facto1_sched4_1d ...........................***Timeout 415.24 sec ischedInit: The thread number has been automatically set to 256 Test #2037: mpi_dst_example_simple_lap_s_facto0_sched0_not_tqrcpend .................***Timeout 415.23 sec Test #2042: mpi_dst_example_simple_lap_s_facto0_sched0_not_rqrrtbegin ...............***Timeout 415.23 sec Test #2044: mpi_dst_example_simple_lap_s_facto0_sched0_kway_rqrrtbegin ..............***Timeout 415.22 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Test #2045: mpi_dst_example_simple_lap_s_facto0_sched0_kway_rqrrtend 
................***Timeout 415.22 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2055: mpi_dst_example_simple_lap_s_facto1_sched0_kwayprojections_svdend .......***Timeout 415.21 sec Test #2059: mpi_dst_example_simple_lap_s_facto1_sched0_kway_pqrcpend ................***Timeout 415.20 sec ischedInit: The thread number has been automatically set to 256 Test #2070: mpi_dst_example_simple_lap_s_facto1_sched0_kway_tqrcpbegin ..............***Timeout 415.20 sec Test #2090: mpi_dst_example_simple_lap_s_facto2_sched0_kway_pqrcpbegin ..............***Timeout 415.19 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple 
Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2094: mpi_dst_example_simple_lap_s_facto2_sched0_not_rqrcpbegin ...............***Timeout 415.18 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2176: mpi_dst_example_simple_lap_d_facto1_sched0_kway_pqrcpilu0 ...............***Timeout 415.18 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2176: mpi_dst_example_simple_lap_d_facto1_sched0_kway_pqrcpilu0 
Test #2177: mpi_dst_example_simple_lap_d_facto1_sched0_kway_pqrcpilu1 ...............***Timeout 415.18 sec Start 2177: mpi_dst_example_simple_lap_d_facto1_sched0_kway_pqrcpilu1 Test #2178: mpi_dst_example_simple_lap_d_facto2_sched0_not_svdbegin .................***Timeout 415.18 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.098066e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.236498e-02 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.125925e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 5.121137e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.168607e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.211137e-01 s Time to initialize coeftab 1.201740e+00 s Start 2178: mpi_dst_example_simple_lap_d_facto2_sched0_not_svdbegin Test #2179: mpi_dst_example_simple_lap_d_facto2_sched0_not_svdend ...................***Timeout 415.18 sec ischedInit: The thread number has been automatically set to 256 Start 2179: mpi_dst_example_simple_lap_d_facto2_sched0_not_svdend Test #2182: mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_svdbegin .....***Timeout 415.17 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance 
criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2182: mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_svdbegin Test #2183: mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_svdend .......***Timeout 415.17 sec Start 2183: mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_svdend Test #2184: mpi_dst_example_simple_lap_d_facto2_sched0_not_pqrcpbegin ...............***Timeout 415.17 sec ischedInit: The thread number has been automatically set to 256 Start 2184: mpi_dst_example_simple_lap_d_facto2_sched0_not_pqrcpbegin Test #2185: mpi_dst_example_simple_lap_d_facto2_sched0_not_pqrcpend .................***Timeout 415.17 sec Start 2185: mpi_dst_example_simple_lap_d_facto2_sched0_not_pqrcpend Test #2186: mpi_dst_example_simple_lap_d_facto2_sched0_kway_pqrcpbegin ..............***Timeout 415.17 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2186: mpi_dst_example_simple_lap_d_facto2_sched0_kway_pqrcpbegin Test #2187: mpi_dst_example_simple_lap_d_facto2_sched0_kway_pqrcpend ................***Timeout 415.17 sec ischedInit: The thread number has been automatically set to 256 Start 2187: mpi_dst_example_simple_lap_d_facto2_sched0_kway_pqrcpend Test #2188: mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_pqrcpbegin ...***Timeout 415.16 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2188: mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_pqrcpbegin Test #2189: mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_pqrcpend .....***Timeout 415.16 sec ischedInit: The thread number has been automatically set to 256 Start 2189: 
mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_pqrcpend Test #2190: mpi_dst_example_simple_lap_d_facto2_sched0_not_rqrcpbegin ...............***Timeout 415.15 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 2190: mpi_dst_example_simple_lap_d_facto2_sched0_not_rqrcpbegin Test #2191: mpi_dst_example_simple_lap_d_facto2_sched0_not_rqrcpend .................***Timeout 415.15 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: 
PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2191: mpi_dst_example_simple_lap_d_facto2_sched0_not_rqrcpend Test #2192: mpi_dst_example_simple_lap_d_facto2_sched0_kway_rqrcpbegin ..............***Timeout 415.15 sec Start 2192: mpi_dst_example_simple_lap_d_facto2_sched0_kway_rqrcpbegin Test #2194: mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_rqrcpbegin ...***Timeout 415.15 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2194: mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_rqrcpbegin Test #2195: mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_rqrcpend .....***Timeout 415.15 sec Start 2195: mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_rqrcpend Test #2196: mpi_dst_example_simple_lap_d_facto2_sched0_not_tqrcpbegin ...............***Timeout 415.15 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2196: mpi_dst_example_simple_lap_d_facto2_sched0_not_tqrcpbegin Test #2197: mpi_dst_example_simple_lap_d_facto2_sched0_not_tqrcpend .................***Timeout 415.15 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 
6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 2197: mpi_dst_example_simple_lap_d_facto2_sched0_not_tqrcpend Test #2198: mpi_dst_example_simple_lap_d_facto2_sched0_kway_tqrcpbegin ..............***Timeout 415.15 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 
2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.190198e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.479594e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.683274e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Start 2198: mpi_dst_example_simple_lap_d_facto2_sched0_kway_tqrcpbegin Test #2199: mpi_dst_example_simple_lap_d_facto2_sched0_kway_tqrcpend ................***Timeout 415.14 sec Start 2199: mpi_dst_example_simple_lap_d_facto2_sched0_kway_tqrcpend Test #2200: mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_tqrcpbegin ...***Timeout 415.14 sec Start 2200: mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_tqrcpbegin Test #2201: mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_tqrcpend .....***Timeout 415.14 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 
/ 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2201: mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_tqrcpend Test #2202: mpi_dst_example_simple_lap_d_facto2_sched0_not_rqrrtbegin ...............***Timeout 415.14 sec Start 2202: mpi_dst_example_simple_lap_d_facto2_sched0_not_rqrrtbegin Test #2203: mpi_dst_example_simple_lap_d_facto2_sched0_not_rqrrtend .................***Timeout 415.14 sec ischedInit: The thread number has been automatically set to 256 Start 2203: mpi_dst_example_simple_lap_d_facto2_sched0_not_rqrrtend Test #2204: mpi_dst_example_simple_lap_d_facto2_sched0_kway_rqrrtbegin ..............***Timeout 415.14 sec Start 2204: mpi_dst_example_simple_lap_d_facto2_sched0_kway_rqrrtbegin Test #2205: mpi_dst_example_simple_lap_d_facto2_sched0_kway_rqrrtend ................***Timeout 415.14 sec ischedInit: The thread number has been automatically set to 256 Start 2205: mpi_dst_example_simple_lap_d_facto2_sched0_kway_rqrrtend Test #2206: mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_rqrrtbegin ...***Timeout 415.14 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2206: mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_rqrrtbegin Test #2207: mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_rqrrtend .....***Timeout 415.13 sec Start 2207: mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_rqrrtend Test #2208: mpi_dst_example_simple_lap_d_facto2_sched0_kway_pqrcpilu0 ...............***Timeout 415.13 
sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2208: mpi_dst_example_simple_lap_d_facto2_sched0_kway_pqrcpilu0 Test #2209: mpi_dst_example_simple_lap_d_facto2_sched0_kway_pqrcpilu1 ...............***Timeout 415.13 sec ischedInit: The thread number has been automatically set to 256 Start 2209: mpi_dst_example_simple_lap_d_facto2_sched0_kway_pqrcpilu1 Test #2210: mpi_dst_example_simple_lap_c_facto0_sched0_not_svdbegin .................***Timeout 415.13 sec Start 2210: mpi_dst_example_simple_lap_c_facto0_sched0_not_svdbegin Test #2211: mpi_dst_example_simple_lap_c_facto0_sched0_not_svdend ...................***Timeout 415.13 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2211: mpi_dst_example_simple_lap_c_facto0_sched0_not_svdend Test #2212: mpi_dst_example_simple_lap_c_facto0_sched0_kway_svdbegin ................***Timeout 415.13 sec Start 2212: mpi_dst_example_simple_lap_c_facto0_sched0_kway_svdbegin Test #2213: 
mpi_dst_example_simple_lap_c_facto0_sched0_kway_svdend ..................***Timeout 415.13 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2213: mpi_dst_example_simple_lap_c_facto0_sched0_kway_svdend Test #2214: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_svdbegin .....***Timeout 415.13 sec Start 2214: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_svdbegin Test #2215: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_svdend .......***Timeout 415.13 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI 
communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 2215: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_svdend Test #2216: mpi_dst_example_simple_lap_c_facto0_sched0_not_pqrcpbegin ...............***Timeout 415.13 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2216: mpi_dst_example_simple_lap_c_facto0_sched0_not_pqrcpbegin Test #2217: mpi_dst_example_simple_lap_c_facto0_sched0_not_pqrcpend .................***Timeout 
415.12 sec ischedInit: The thread number has been automatically set to 256 Start 2217: mpi_dst_example_simple_lap_c_facto0_sched0_not_pqrcpend Test #2218: mpi_dst_example_simple_lap_c_facto0_sched0_kway_pqrcpbegin ..............***Timeout 415.12 sec ischedInit: The thread number has been automatically set to 256 Start 2218: mpi_dst_example_simple_lap_c_facto0_sched0_kway_pqrcpbegin Test #2220: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_pqrcpbegin ...***Timeout 415.12 sec Start 2220: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_pqrcpbegin Test #2221: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_pqrcpend .....***Timeout 415.11 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2221: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_pqrcpend Test #2222: mpi_dst_example_simple_lap_c_facto0_sched0_not_rqrcpbegin ...............***Timeout 415.11 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + 
PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2222: mpi_dst_example_simple_lap_c_facto0_sched0_not_rqrcpbegin Test #2224: mpi_dst_example_simple_lap_c_facto0_sched0_kway_rqrcpbegin ..............***Timeout 415.11 sec Start 2224: mpi_dst_example_simple_lap_c_facto0_sched0_kway_rqrcpbegin Test #2225: mpi_dst_example_simple_lap_c_facto0_sched0_kway_rqrcpend ................***Timeout 415.11 sec ischedInit: The thread number has been automatically set to 256 Start 2225: mpi_dst_example_simple_lap_c_facto0_sched0_kway_rqrcpend Test #2226: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_rqrcpbegin ...***Timeout 415.10 sec ischedInit: The thread number has been automatically set to 256 Start 2226: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_rqrcpbegin Test #2227: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_rqrcpend .....***Timeout 415.10 sec ischedInit: The thread number has been automatically set to 256 Start 2227: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_rqrcpend Test #2228: mpi_dst_example_simple_lap_c_facto0_sched0_not_tqrcpbegin ...............***Timeout 415.10 sec ischedInit: The 
thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2228: mpi_dst_example_simple_lap_c_facto0_sched0_not_tqrcpbegin Test #2229: mpi_dst_example_simple_lap_c_facto0_sched0_not_tqrcpend .................***Timeout 415.10 sec Start 2229: mpi_dst_example_simple_lap_c_facto0_sched0_not_tqrcpend Test #2230: mpi_dst_example_simple_lap_c_facto0_sched0_kway_tqrcpbegin ..............***Timeout 415.10 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2230: mpi_dst_example_simple_lap_c_facto0_sched0_kway_tqrcpbegin Test #2231: mpi_dst_example_simple_lap_c_facto0_sched0_kway_tqrcpend ................***Timeout 415.10 sec Start 2231: mpi_dst_example_simple_lap_c_facto0_sched0_kway_tqrcpend Test #2233: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_tqrcpend .....***Timeout 415.09 sec ischedInit: The thread number has been automatically 
set to 256 ischedInit: The thread number has been automatically set to 256 Start 2233: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_tqrcpend Test #2234: mpi_dst_example_simple_lap_c_facto0_sched0_not_rqrrtbegin ...............***Timeout 415.09 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2234: mpi_dst_example_simple_lap_c_facto0_sched0_not_rqrrtbegin Test #2235: mpi_dst_example_simple_lap_c_facto0_sched0_not_rqrrtend .................***Timeout 415.09 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 
Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2235: mpi_dst_example_simple_lap_c_facto0_sched0_not_rqrrtend Test #2236: mpi_dst_example_simple_lap_c_facto0_sched0_kway_rqrrtbegin ..............***Timeout 415.09 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute 
Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2236: mpi_dst_example_simple_lap_c_facto0_sched0_kway_rqrrtbegin Test #2238: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_rqrrtbegin ...***Timeout 415.08 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2238: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_rqrrtbegin Test #2239: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_rqrrtend .....***Timeout 415.08 sec Start 2239: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_rqrrtend Test #2240: mpi_dst_example_simple_lap_c_facto0_sched0_kway_pqrcpilu0 ...............***Timeout 415.08 sec ischedInit: The thread number has been automatically set to 256 Start 2240: mpi_dst_example_simple_lap_c_facto0_sched0_kway_pqrcpilu0 Test #2241: mpi_dst_example_simple_lap_c_facto0_sched0_kway_pqrcpilu1 ...............***Timeout 415.08 sec ischedInit: The thread number has been automatically set to 256 Start 2241: mpi_dst_example_simple_lap_c_facto0_sched0_kway_pqrcpilu1 Test #2243: mpi_dst_example_simple_lap_c_facto1_sched0_not_svdend ...................***Timeout 415.07 sec ischedInit: The thread number has been automatically set to 256 Start 2243: mpi_dst_example_simple_lap_c_facto1_sched0_not_svdend Test #2245: mpi_dst_example_simple_lap_c_facto1_sched0_kway_svdend ..................***Timeout 415.06 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI 
communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 2245: mpi_dst_example_simple_lap_c_facto1_sched0_kway_svdend Test #2246: mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_svdbegin .....***Timeout 415.06 sec Start 2246: mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_svdbegin Test #2247: mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_svdend .......***Timeout 415.06 sec Start 2247: mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_svdend Test #2248: mpi_dst_example_simple_lap_c_facto1_sched0_not_pqrcpbegin ...............***Timeout 415.06 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2248: mpi_dst_example_simple_lap_c_facto1_sched0_not_pqrcpbegin Test #2249: mpi_dst_example_simple_lap_c_facto1_sched0_not_pqrcpend .................***Timeout 415.06 sec ischedInit: The thread number has been automatically set to 256 Start 2249: mpi_dst_example_simple_lap_c_facto1_sched0_not_pqrcpend Test #2250: mpi_dst_example_simple_lap_c_facto1_sched0_kway_pqrcpbegin ..............***Timeout 415.05 sec Start 2250: mpi_dst_example_simple_lap_c_facto1_sched0_kway_pqrcpbegin Test #2251: mpi_dst_example_simple_lap_c_facto1_sched0_kway_pqrcpend ................***Timeout 415.05 sec ischedInit: The thread number has been automatically set to 256 Start 2251: 
mpi_dst_example_simple_lap_c_facto1_sched0_kway_pqrcpend Test #2252: mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_pqrcpbegin ...***Timeout 415.05 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2252: mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_pqrcpbegin Test #2253: mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_pqrcpend .....***Timeout 415.05 sec Start 2253: mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_pqrcpend Test #2254: mpi_dst_example_simple_lap_c_facto1_sched0_not_rqrcpbegin ...............***Timeout 415.04 sec ischedInit: The thread number has been automatically set to 256 Start 2254: mpi_dst_example_simple_lap_c_facto1_sched0_not_rqrcpbegin Test #2255: mpi_dst_example_simple_lap_c_facto1_sched0_not_rqrcpend .................***Timeout 415.04 sec Start 2255: mpi_dst_example_simple_lap_c_facto1_sched0_not_rqrcpend Test #2256: mpi_dst_example_simple_lap_c_facto1_sched0_kway_rqrcpbegin ..............***Timeout 415.04 sec ischedInit: The thread number 
has been automatically set to 256 Start 2256: mpi_dst_example_simple_lap_c_facto1_sched0_kway_rqrcpbegin Test #2257: mpi_dst_example_simple_lap_c_facto1_sched0_kway_rqrcpend ................***Timeout 415.04 sec ischedInit: The thread number has been automatically set to 256 Start 2257: mpi_dst_example_simple_lap_c_facto1_sched0_kway_rqrcpend Test #2258: mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_rqrcpbegin ...***Timeout 415.03 sec Start 2258: mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_rqrcpbegin Test #2259: mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_rqrcpend .....***Timeout 415.03 sec Start 2259: mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_rqrcpend Test #2260: mpi_dst_example_simple_lap_c_facto1_sched0_not_tqrcpbegin ...............***Timeout 415.03 sec ischedInit: The thread number has been automatically set to 256 Start 2260: mpi_dst_example_simple_lap_c_facto1_sched0_not_tqrcpbegin Test #2261: mpi_dst_example_simple_lap_c_facto1_sched0_not_tqrcpend .................***Timeout 415.03 sec Start 2261: mpi_dst_example_simple_lap_c_facto1_sched0_not_tqrcpend Test #2263: mpi_dst_example_simple_lap_c_facto1_sched0_kway_tqrcpend ................***Timeout 415.01 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2263: mpi_dst_example_simple_lap_c_facto1_sched0_kway_tqrcpend Test #2264: mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_tqrcpbegin ...***Timeout 415.01 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2264: mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_tqrcpbegin Test #2265: mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_tqrcpend .....***Timeout 415.01 sec Start 2265: mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_tqrcpend Test #2266: 
mpi_dst_example_simple_lap_c_facto1_sched0_not_rqrrtbegin ...............***Timeout 415.01 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2266: mpi_dst_example_simple_lap_c_facto1_sched0_not_rqrrtbegin Test #2267: mpi_dst_example_simple_lap_c_facto1_sched0_not_rqrrtend .................***Timeout 415.01 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2267: mpi_dst_example_simple_lap_c_facto1_sched0_not_rqrrtend Test #2269: mpi_dst_example_simple_lap_c_facto1_sched0_kway_rqrrtend ................***Timeout 415.00 sec ischedInit: The thread number has been automatically set to 256 Start 2269: mpi_dst_example_simple_lap_c_facto1_sched0_kway_rqrrtend Test #2270: mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_rqrrtbegin ...***Timeout 415.00 
sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2270: mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_rqrrtbegin Test #2271: mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_rqrrtend .....***Timeout 414.99 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2271: mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_rqrrtend Test #2272: mpi_dst_example_simple_lap_c_facto1_sched0_kway_pqrcpilu0 ...............***Timeout 414.99 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple 
Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2272: mpi_dst_example_simple_lap_c_facto1_sched0_kway_pqrcpilu0 Test #2273: mpi_dst_example_simple_lap_c_facto1_sched0_kway_pqrcpilu1 ...............***Timeout 414.99 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 
1140 2: 200 760 3: 200 660 Start 2273: mpi_dst_example_simple_lap_c_facto1_sched0_kway_pqrcpilu1 Test #2274: mpi_dst_example_simple_lap_c_facto2_sched0_not_svdbegin .................***Timeout 414.99 sec Start 2274: mpi_dst_example_simple_lap_c_facto2_sched0_not_svdbegin Test #2276: mpi_dst_example_simple_lap_c_facto2_sched0_kway_svdbegin ................***Timeout 414.98 sec ischedInit: The thread number has been automatically set to 256 Start 2276: mpi_dst_example_simple_lap_c_facto2_sched0_kway_svdbegin Test #2277: mpi_dst_example_simple_lap_c_facto2_sched0_kway_svdend ..................***Timeout 414.98 sec Start 2277: mpi_dst_example_simple_lap_c_facto2_sched0_kway_svdend Test #2278: mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_svdbegin .....***Timeout 414.98 sec Start 2278: mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_svdbegin Test #2279: mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_svdend .......***Timeout 414.98 sec ischedInit: The thread number has been automatically set to 256 Start 2279: mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_svdend Test #2280: mpi_dst_example_simple_lap_c_facto2_sched0_not_pqrcpbegin ...............***Timeout 414.98 sec Start 2280: mpi_dst_example_simple_lap_c_facto2_sched0_not_pqrcpbegin Test #2281: mpi_dst_example_simple_lap_c_facto2_sched0_not_pqrcpend .................***Timeout 414.98 sec ischedInit: The thread number has been automatically set to 256 Start 2281: mpi_dst_example_simple_lap_c_facto2_sched0_not_pqrcpend Test #2282: mpi_dst_example_simple_lap_c_facto2_sched0_kway_pqrcpbegin ..............***Timeout 414.97 sec Start 2282: mpi_dst_example_simple_lap_c_facto2_sched0_kway_pqrcpbegin Test #2283: mpi_dst_example_simple_lap_c_facto2_sched0_kway_pqrcpend ................***Timeout 414.97 sec Start 2283: mpi_dst_example_simple_lap_c_facto2_sched0_kway_pqrcpend Test #2284: mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_pqrcpbegin ...***Timeout 414.97 sec 
ischedInit: The thread number has been automatically set to 256 Start 2284: mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_pqrcpbegin Test #2285: mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_pqrcpend .....***Timeout 414.97 sec Start 2285: mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_pqrcpend Test #2286: mpi_dst_example_simple_lap_c_facto2_sched0_not_rqrcpbegin ...............***Timeout 414.97 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2286: mpi_dst_example_simple_lap_c_facto2_sched0_not_rqrcpbegin Test #2287: mpi_dst_example_simple_lap_c_facto2_sched0_not_rqrcpend .................***Timeout 414.97 sec ischedInit: The thread number has been automatically set to 256 Start 2287: mpi_dst_example_simple_lap_c_facto2_sched0_not_rqrcpend Test #2288: mpi_dst_example_simple_lap_c_facto2_sched0_kway_rqrcpbegin ..............***Timeout 414.97 sec Start 2288: mpi_dst_example_simple_lap_c_facto2_sched0_kway_rqrcpbegin Test 
#2289: mpi_dst_example_simple_lap_c_facto2_sched0_kway_rqrcpend ................***Timeout 414.96 sec ischedInit: The thread number has been automatically set to 256 Start 2289: mpi_dst_example_simple_lap_c_facto2_sched0_kway_rqrcpend Test #2290: mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_rqrcpbegin ...***Timeout 414.96 sec ischedInit: The thread number has been automatically set to 256 Start 2290: mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_rqrcpbegin Test #2291: mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_rqrcpend .....***Timeout 414.96 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 2291: mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_rqrcpend Test #2292: mpi_dst_example_simple_lap_c_facto2_sched0_not_tqrcpbegin ...............***Timeout 414.96 sec ischedInit: The thread number has been automatically set to 256 Start 2292: mpi_dst_example_simple_lap_c_facto2_sched0_not_tqrcpbegin Test 
#2293: mpi_dst_example_simple_lap_c_facto2_sched0_not_tqrcpend .................***Timeout 414.96 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2293: mpi_dst_example_simple_lap_c_facto2_sched0_not_tqrcpend Test #2294: mpi_dst_example_simple_lap_c_facto2_sched0_kway_tqrcpbegin ..............***Timeout 414.96 sec ischedInit: The thread number has been automatically set to 256 Start 2294: mpi_dst_example_simple_lap_c_facto2_sched0_kway_tqrcpbegin Test #2295: mpi_dst_example_simple_lap_c_facto2_sched0_kway_tqrcpend ................***Timeout 414.96 sec ischedInit: The thread number has been automatically set to 256 Start 2295: mpi_dst_example_simple_lap_c_facto2_sched0_kway_tqrcpend Test #2297: mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_tqrcpend .....***Timeout 414.95 sec Start 2297: mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_tqrcpend Test #2298: mpi_dst_example_simple_lap_c_facto2_sched0_not_rqrrtbegin ...............***Timeout 414.95 sec Start 2298: mpi_dst_example_simple_lap_c_facto2_sched0_not_rqrrtbegin Test #2299: mpi_dst_example_simple_lap_c_facto2_sched0_not_rqrrtend .................***Timeout 414.94 sec Start 2299: mpi_dst_example_simple_lap_c_facto2_sched0_not_rqrrtend Test #2300: mpi_dst_example_simple_lap_c_facto2_sched0_kway_rqrrtbegin ..............***Timeout 414.94 sec ischedInit: The thread number has been automatically set to 256 Start 2300: mpi_dst_example_simple_lap_c_facto2_sched0_kway_rqrrtbegin Test #2301: mpi_dst_example_simple_lap_c_facto2_sched0_kway_rqrrtend ................***Timeout 414.94 sec ischedInit: The thread number has been automatically set to 256 Start 2301: mpi_dst_example_simple_lap_c_facto2_sched0_kway_rqrrtend Test #2302: mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_rqrrtbegin ...***Timeout 414.94 sec Start 2302: mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_rqrrtbegin Test 
#2303: mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_rqrrtend .....***Timeout 414.94 sec Start 2303: mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_rqrrtend Test #2304: mpi_dst_example_simple_lap_c_facto2_sched0_kway_pqrcpilu0 ...............***Timeout 414.93 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2304: mpi_dst_example_simple_lap_c_facto2_sched0_kway_pqrcpilu0 Test #2305: mpi_dst_example_simple_lap_c_facto2_sched0_kway_pqrcpilu1 ...............***Timeout 414.93 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple 
Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2305: mpi_dst_example_simple_lap_c_facto2_sched0_kway_pqrcpilu1 Test #2307: mpi_dst_example_simple_lap_c_facto3_sched0_not_svdend ...................***Timeout 414.92 sec Start 2307: mpi_dst_example_simple_lap_c_facto3_sched0_not_svdend Test #2309: mpi_dst_example_simple_lap_c_facto3_sched0_kway_svdend ..................***Timeout 414.90 sec Start 2309: mpi_dst_example_simple_lap_c_facto3_sched0_kway_svdend Test #2311: mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_svdend .......***Timeout 414.89 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and 
projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2311: mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_svdend Test #2312: mpi_dst_example_simple_lap_c_facto3_sched0_not_pqrcpbegin ...............***Timeout 414.89 sec Start 2312: mpi_dst_example_simple_lap_c_facto3_sched0_not_pqrcpbegin Test #2313: mpi_dst_example_simple_lap_c_facto3_sched0_not_pqrcpend .................***Timeout 414.88 sec ischedInit: The thread number has been automatically set to 256 Start 2313: mpi_dst_example_simple_lap_c_facto3_sched0_not_pqrcpend Test #2315: mpi_dst_example_simple_lap_c_facto3_sched0_kway_pqrcpend ................***Timeout 414.88 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 2315: mpi_dst_example_simple_lap_c_facto3_sched0_kway_pqrcpend Test #2316: mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_pqrcpbegin ...***Timeout 414.87 sec ischedInit: The thread number has been 
automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.303960e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.921101e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.692431e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 
MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 4.218860e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.484252e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 6.082230e-01 s Time to initialize coeftab 8.719323e-01 s Time to factorize 4.700990e+00 s ( 4.31 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 6.094814e-02 s - iteration 1 : total iteration time 0.158 s error 6.309e-11 Time for refinement 3.441992e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.629023e-08 max(|| b_i - A x_i ||_1) 3.381212e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.531976e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.629023e-08 max(|| b_i - A x_i ||_1) 3.381212e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.531976e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.629023e-08 max(|| b_i - A x_i ||_1) 3.381212e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.531976e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.629023e-08 max(|| b_i - A x_i ||_1) 3.381212e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.531976e-01 (SUCCESS) Start 2316: 
mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_pqrcpbegin Test #2317: mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_pqrcpend .....***Timeout 414.87 sec Start 2317: mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_pqrcpend Test #2318: mpi_dst_example_simple_lap_c_facto3_sched0_not_rqrcpbegin ...............***Timeout 414.87 sec ischedInit: The thread number has been automatically set to 256 Start 2318: mpi_dst_example_simple_lap_c_facto3_sched0_not_rqrcpbegin Test #2319: mpi_dst_example_simple_lap_c_facto3_sched0_not_rqrcpend .................***Timeout 414.87 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2319: mpi_dst_example_simple_lap_c_facto3_sched0_not_rqrcpend Test #2320: mpi_dst_example_simple_lap_c_facto3_sched0_kway_rqrcpbegin ..............***Timeout 414.87 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number 
has been automatically set to 256 Start 2320: mpi_dst_example_simple_lap_c_facto3_sched0_kway_rqrcpbegin Test #2321: mpi_dst_example_simple_lap_c_facto3_sched0_kway_rqrcpend ................***Timeout 414.87 sec Start 2321: mpi_dst_example_simple_lap_c_facto3_sched0_kway_rqrcpend Test #2322: mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_rqrcpbegin ...***Timeout 414.87 sec Start 2322: mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_rqrcpbegin Test #2323: mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_rqrcpend .....***Timeout 414.86 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 2323: mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_rqrcpend Test #2324: mpi_dst_example_simple_lap_c_facto3_sched0_not_tqrcpbegin ...............***Timeout 414.86 sec Start 2324: mpi_dst_example_simple_lap_c_facto3_sched0_not_tqrcpbegin Test #2325: mpi_dst_example_simple_lap_c_facto3_sched0_not_tqrcpend 
.................***Timeout 414.86 sec Start 2325: mpi_dst_example_simple_lap_c_facto3_sched0_not_tqrcpend Test #2326: mpi_dst_example_simple_lap_c_facto3_sched0_kway_tqrcpbegin ..............***Timeout 414.86 sec Start 2326: mpi_dst_example_simple_lap_c_facto3_sched0_kway_tqrcpbegin Test #2327: mpi_dst_example_simple_lap_c_facto3_sched0_kway_tqrcpend ................***Timeout 414.86 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 2327: mpi_dst_example_simple_lap_c_facto3_sched0_kway_tqrcpend Test #2328: mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_tqrcpbegin ...***Timeout 414.86 sec Start 2328: mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_tqrcpbegin Test #2329: mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_tqrcpend .....***Timeout 414.85 sec ischedInit: The thread number has been automatically set to 256 Start 2329: 
mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_tqrcpend Test #2330: mpi_dst_example_simple_lap_c_facto3_sched0_not_rqrrtbegin ...............***Timeout 414.85 sec ischedInit: The thread number has been automatically set to 256 Start 2330: mpi_dst_example_simple_lap_c_facto3_sched0_not_rqrrtbegin Test #2331: mpi_dst_example_simple_lap_c_facto3_sched0_not_rqrrtend .................***Timeout 414.85 sec Start 2331: mpi_dst_example_simple_lap_c_facto3_sched0_not_rqrrtend Test #2332: mpi_dst_example_simple_lap_c_facto3_sched0_kway_rqrrtbegin ..............***Timeout 414.85 sec Start 2332: mpi_dst_example_simple_lap_c_facto3_sched0_kway_rqrrtbegin Test #2333: mpi_dst_example_simple_lap_c_facto3_sched0_kway_rqrrtend ................***Timeout 414.84 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2333: mpi_dst_example_simple_lap_c_facto3_sched0_kway_rqrrtend Test #2334: mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_rqrrtbegin ...***Timeout 414.84 
sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2334: mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_rqrrtbegin Test #2335: mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_rqrrtend .....***Timeout 414.84 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia 
K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2335: mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_rqrrtend Test #2336: mpi_dst_example_simple_lap_c_facto3_sched0_kway_pqrcpilu0 ...............***Timeout 414.84 sec Start 2336: mpi_dst_example_simple_lap_c_facto3_sched0_kway_pqrcpilu0 Test #2337: mpi_dst_example_simple_lap_c_facto3_sched0_kway_pqrcpilu1 ...............***Timeout 414.84 sec ischedInit: The thread number has been automatically set to 256 Start 2337: mpi_dst_example_simple_lap_c_facto3_sched0_kway_pqrcpilu1 Test #2339: mpi_dst_example_simple_lap_c_facto4_sched0_not_svdend ...................***Timeout 414.83 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections 
width 1 ischedInit: The thread number has been automatically set to 256 Start 2339: mpi_dst_example_simple_lap_c_facto4_sched0_not_svdend Test #2340: mpi_dst_example_simple_lap_c_facto4_sched0_kway_svdbegin ................***Timeout 414.82 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2340: mpi_dst_example_simple_lap_c_facto4_sched0_kway_svdbegin Test #2341: mpi_dst_example_simple_lap_c_facto4_sched0_kway_svdend ..................***Timeout 414.82 sec Start 2341: mpi_dst_example_simple_lap_c_facto4_sched0_kway_svdend Test #2343: mpi_dst_example_simple_lap_c_facto4_sched0_kwayprojections_svdend .......***Timeout 414.81 sec Start 2343: mpi_dst_example_simple_lap_c_facto4_sched0_kwayprojections_svdend Test #2344: mpi_dst_example_simple_lap_c_facto4_sched0_not_pqrcpbegin ...............***Timeout 414.81 sec Start 2344: mpi_dst_example_simple_lap_c_facto4_sched0_not_pqrcpbegin Test #2346: mpi_dst_example_simple_lap_c_facto4_sched0_kway_pqrcpbegin ..............***Timeout 414.80 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels 
of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 2346: mpi_dst_example_simple_lap_c_facto4_sched0_kway_pqrcpbegin Test #2347: mpi_dst_example_simple_lap_c_facto4_sched0_kway_pqrcpend ................***Timeout 414.80 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2347: mpi_dst_example_simple_lap_c_facto4_sched0_kway_pqrcpend Test #2348: mpi_dst_example_simple_lap_c_facto4_sched0_kwayprojections_pqrcpbegin ...***Timeout 414.80 sec Start 2348: mpi_dst_example_simple_lap_c_facto4_sched0_kwayprojections_pqrcpbegin Test #2349: mpi_dst_example_simple_lap_c_facto4_sched0_kwayprojections_pqrcpend .....***Timeout 414.80 sec Start 2349: mpi_dst_example_simple_lap_c_facto4_sched0_kwayprojections_pqrcpend Test #2105: mpi_dst_example_simple_lap_s_facto2_sched0_kwayprojections_tqrcpend .....***Timeout 414.70 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : 
Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2106: mpi_dst_example_simple_lap_s_facto2_sched0_not_rqrrtbegin ...............***Timeout 414.70 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 
Projections distance 3 Projections depth 3 Projections width 1 Test #2107: mpi_dst_example_simple_lap_s_facto2_sched0_not_rqrrtend .................***Timeout 414.69 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2108: mpi_dst_example_simple_lap_s_facto2_sched0_kway_rqrrtbegin ..............***Timeout 414.69 sec ischedInit: The thread number has been automatically set to 256 Test #2109: mpi_dst_example_simple_lap_s_facto2_sched0_kway_rqrrtend ................***Timeout 414.68 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI 
communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2110: mpi_dst_example_simple_lap_s_facto2_sched0_kwayprojections_rqrrtbegin ...***Timeout 414.68 sec ischedInit: The thread number has been automatically set to 256 Test #2111: mpi_dst_example_simple_lap_s_facto2_sched0_kwayprojections_rqrrtend .....***Timeout 414.67 sec ischedInit: The thread number has been automatically set to 256 Test #2112: mpi_dst_example_simple_lap_s_facto2_sched0_kway_pqrcpilu0 ...............***Timeout 414.66 sec Test #2113: mpi_dst_example_simple_lap_s_facto2_sched0_kway_pqrcpilu1 ...............***Timeout 414.66 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2114: mpi_dst_example_simple_lap_d_facto0_sched0_not_svdbegin .................***Timeout 414.65 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2115: mpi_dst_example_simple_lap_d_facto0_sched0_not_svdend ...................***Timeout 414.65 sec Test #2116: mpi_dst_example_simple_lap_d_facto0_sched0_kway_svdbegin ................***Timeout 414.64 sec ischedInit: The thread number has been automatically set to 256 Test #2117: mpi_dst_example_simple_lap_d_facto0_sched0_kway_svdend ..................***Timeout 414.64 sec ischedInit: The thread number has been automatically set to 256 
ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2118: mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_svdbegin .....***Timeout 414.63 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min 
ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2119: mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_svdend .......***Timeout 414.62 sec ischedInit: The thread number has been automatically set to 256 Test #2120: mpi_dst_example_simple_lap_d_facto0_sched0_not_pqrcpbegin ...............***Timeout 414.62 sec Test #2121: mpi_dst_example_simple_lap_d_facto0_sched0_not_pqrcpend .................***Timeout 414.61 sec ischedInit: The thread number has been automatically set to 256 Test #2122: mpi_dst_example_simple_lap_d_facto0_sched0_kway_pqrcpbegin ..............***Timeout 414.61 sec ischedInit: The thread number has been automatically set to 256 Test #2123: mpi_dst_example_simple_lap_d_facto0_sched0_kway_pqrcpend ................***Timeout 414.60 sec ischedInit: The thread number has been automatically set to 256 Test #2124: mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_pqrcpbegin ...***Timeout 414.59 sec Test #2125: mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_pqrcpend .....***Timeout 414.59 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 
16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2126: mpi_dst_example_simple_lap_d_facto0_sched0_not_rqrcpbegin ...............***Timeout 414.58 sec Test #2127: mpi_dst_example_simple_lap_d_facto0_sched0_not_rqrcpend .................***Timeout 414.57 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2128: mpi_dst_example_simple_lap_d_facto0_sched0_kway_rqrcpbegin ..............***Timeout 414.57 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI 
communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Test #2129: mpi_dst_example_simple_lap_d_facto0_sched0_kway_rqrcpend ................***Timeout 414.56 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Test #2130: mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_rqrcpbegin ...***Timeout 414.55 sec Test #2131: 
mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_rqrcpend .....***Timeout 414.55 sec ischedInit: The thread number has been automatically set to 256 Test #2132: mpi_dst_example_simple_lap_d_facto0_sched0_not_tqrcpbegin ...............***Timeout 414.54 sec ischedInit: The thread number has been automatically set to 256 Test #2133: mpi_dst_example_simple_lap_d_facto0_sched0_not_tqrcpend .................***Timeout 414.54 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2134: mpi_dst_example_simple_lap_d_facto0_sched0_kway_tqrcpbegin ..............***Timeout 414.53 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per 
process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 2187/3626 Test #2388: mpi_dst_example_simple_lap_z_facto0_sched0_not_tqrcpbegin ...............***Timeout 414.52 sec ischedInit: The thread number has been automatically set to 256 Start 2388: mpi_dst_example_simple_lap_z_facto0_sched0_not_tqrcpbegin 2187/3626 Test #2389: mpi_dst_example_simple_lap_z_facto0_sched0_not_tqrcpend .................***Timeout 414.52 sec Start 2389: mpi_dst_example_simple_lap_z_facto0_sched0_not_tqrcpend 2187/3626 Test #2390: mpi_dst_example_simple_lap_z_facto0_sched0_kway_tqrcpbegin ..............***Timeout 414.52 sec ischedInit: The thread number has been automatically set to 256 Start 2390: mpi_dst_example_simple_lap_z_facto0_sched0_kway_tqrcpbegin 2187/3626 Test #2391: mpi_dst_example_simple_lap_z_facto0_sched0_kway_tqrcpend ................***Timeout 414.52 sec ischedInit: The thread number has been automatically set to 256 Start 2391: mpi_dst_example_simple_lap_z_facto0_sched0_kway_tqrcpend 2187/3626 Test #2392: mpi_dst_example_simple_lap_z_facto0_sched0_kwayprojections_tqrcpbegin ...***Timeout 414.52 sec Start 2392: mpi_dst_example_simple_lap_z_facto0_sched0_kwayprojections_tqrcpbegin 2187/3626 Test #2393: mpi_dst_example_simple_lap_z_facto0_sched0_kwayprojections_tqrcpend .....***Timeout 414.52 sec ischedInit: The thread number has been automatically set to 256 Start 2393: 
mpi_dst_example_simple_lap_z_facto0_sched0_kwayprojections_tqrcpend 2187/3626 Test #2394: mpi_dst_example_simple_lap_z_facto0_sched0_not_rqrrtbegin ...............***Timeout 414.52 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 2394: mpi_dst_example_simple_lap_z_facto0_sched0_not_rqrrtbegin 2187/3626 Test #2395: mpi_dst_example_simple_lap_z_facto0_sched0_not_rqrrtend .................***Timeout 414.52 sec ischedInit: The thread number has been automatically set to 256 Start 2395: mpi_dst_example_simple_lap_z_facto0_sched0_not_rqrrtend 2187/3626 Test #2396: mpi_dst_example_simple_lap_z_facto0_sched0_kway_rqrrtbegin ..............***Timeout 414.51 sec ischedInit: The thread number has been automatically set to 256 Start 2396: mpi_dst_example_simple_lap_z_facto0_sched0_kway_rqrrtbegin 2187/3626 Test #2397: mpi_dst_example_simple_lap_z_facto0_sched0_kway_rqrrtend ................***Timeout 414.51 sec ischedInit: The thread number has been automatically set to 256 
ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2397: mpi_dst_example_simple_lap_z_facto0_sched0_kway_rqrrtend 2187/3626 Test #2398: mpi_dst_example_simple_lap_z_facto0_sched0_kwayprojections_rqrrtbegin ...***Timeout 414.51 sec Start 2398: mpi_dst_example_simple_lap_z_facto0_sched0_kwayprojections_rqrrtbegin 2187/3626 Test #2399: mpi_dst_example_simple_lap_z_facto0_sched0_kwayprojections_rqrrtend .....***Timeout 414.51 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2399: mpi_dst_example_simple_lap_z_facto0_sched0_kwayprojections_rqrrtend 2187/3626 Test #2400: mpi_dst_example_simple_lap_z_facto0_sched0_kway_pqrcpilu0 ...............***Timeout 414.51 sec Start 2400: mpi_dst_example_simple_lap_z_facto0_sched0_kway_pqrcpilu0 2187/3626 Test #2401: mpi_dst_example_simple_lap_z_facto0_sched0_kway_pqrcpilu1 ...............***Timeout 414.51 sec ischedInit: The thread number has been automatically set to 256 Start 2401: 
mpi_dst_example_simple_lap_z_facto0_sched0_kway_pqrcpilu1 2187/3626 Test #2402: mpi_dst_example_simple_lap_z_facto1_sched0_not_svdbegin .................***Timeout 414.51 sec Start 2402: mpi_dst_example_simple_lap_z_facto1_sched0_not_svdbegin 2187/3626 Test #2403: mpi_dst_example_simple_lap_z_facto1_sched0_not_svdend ...................***Timeout 414.51 sec Start 2403: mpi_dst_example_simple_lap_z_facto1_sched0_not_svdend 2187/3626 Test #2404: mpi_dst_example_simple_lap_z_facto1_sched0_kway_svdbegin ................***Timeout 414.51 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2404: mpi_dst_example_simple_lap_z_facto1_sched0_kway_svdbegin 2187/3626 Test #2405: mpi_dst_example_simple_lap_z_facto1_sched0_kway_svdend ..................***Timeout 414.51 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2405: mpi_dst_example_simple_lap_z_facto1_sched0_kway_svdend 2187/3626 Test 
#2406: mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_svdbegin .....***Timeout 414.51 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2406: mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_svdbegin 2187/3626 Test #2407: mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_svdend .......***Timeout 414.51 sec ischedInit: The thread number has been automatically set to 256 Start 2407: mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_svdend 2187/3626 Test #2408: mpi_dst_example_simple_lap_z_facto1_sched0_not_pqrcpbegin ...............***Timeout 414.50 sec 
ischedInit: The thread number has been automatically set to 256 Start 2408: mpi_dst_example_simple_lap_z_facto1_sched0_not_pqrcpbegin 2187/3626 Test #2409: mpi_dst_example_simple_lap_z_facto1_sched0_not_pqrcpend .................***Timeout 414.50 sec ischedInit: The thread number has been automatically set to 256 Start 2409: mpi_dst_example_simple_lap_z_facto1_sched0_not_pqrcpend 2187/3626 Test #2410: mpi_dst_example_simple_lap_z_facto1_sched0_kway_pqrcpbegin ..............***Timeout 414.50 sec Start 2410: mpi_dst_example_simple_lap_z_facto1_sched0_kway_pqrcpbegin 2187/3626 Test #2411: mpi_dst_example_simple_lap_z_facto1_sched0_kway_pqrcpend ................***Timeout 414.50 sec Start 2411: mpi_dst_example_simple_lap_z_facto1_sched0_kway_pqrcpend 2187/3626 Test #2412: mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_pqrcpbegin ...***Timeout 414.50 sec ischedInit: The thread number has been automatically set to 256 Start 2412: mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_pqrcpbegin 2187/3626 Test #2413: mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_pqrcpend .....***Timeout 414.50 sec ischedInit: The thread number has been automatically set to 256 Start 2413: mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_pqrcpend 2187/3626 Test #2414: mpi_dst_example_simple_lap_z_facto1_sched0_not_rqrcpbegin ...............***Timeout 414.50 sec ischedInit: The thread number has been automatically set to 256 Start 2414: mpi_dst_example_simple_lap_z_facto1_sched0_not_rqrcpbegin 2187/3626 Test #2415: mpi_dst_example_simple_lap_z_facto1_sched0_not_rqrcpend .................***Timeout 414.50 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of 
MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2415: mpi_dst_example_simple_lap_z_facto1_sched0_not_rqrcpend 2187/3626 Test #2416: mpi_dst_example_simple_lap_z_facto1_sched0_kway_rqrcpbegin ..............***Timeout 414.51 sec Start 2416: mpi_dst_example_simple_lap_z_facto1_sched0_kway_rqrcpbegin 2187/3626 Test #2417: mpi_dst_example_simple_lap_z_facto1_sched0_kway_rqrcpend ................***Timeout 414.51 sec ischedInit: The thread number has been automatically set to 256 Start 2417: mpi_dst_example_simple_lap_z_facto1_sched0_kway_rqrcpend 2187/3626 Test #2418: mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_rqrcpbegin ...***Timeout 414.51 sec Start 2418: mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_rqrcpbegin 2187/3626 Test #2419: mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_rqrcpend .....***Timeout 414.51 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 
320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2419: mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_rqrcpend 2187/3626 Test #2420: mpi_dst_example_simple_lap_z_facto1_sched0_not_tqrcpbegin ...............***Timeout 414.51 sec ischedInit: The thread number has been automatically set to 256 Start 2420: mpi_dst_example_simple_lap_z_facto1_sched0_not_tqrcpbegin 2187/3626 Test #2421: mpi_dst_example_simple_lap_z_facto1_sched0_not_tqrcpend .................***Timeout 414.51 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2421: mpi_dst_example_simple_lap_z_facto1_sched0_not_tqrcpend 2187/3626 Test #2422: mpi_dst_example_simple_lap_z_facto1_sched0_kway_tqrcpbegin ..............***Timeout 414.51 sec ischedInit: The thread number has been automatically set to 256 Start 2422: mpi_dst_example_simple_lap_z_facto1_sched0_kway_tqrcpbegin 2187/3626 Test #2423: mpi_dst_example_simple_lap_z_facto1_sched0_kway_tqrcpend ................***Timeout 414.51 sec Start 2423: mpi_dst_example_simple_lap_z_facto1_sched0_kway_tqrcpend 2187/3626 Test #2424: mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_tqrcpbegin ...***Timeout 414.51 sec ischedInit: The thread number has been automatically set to 256 Start 2424: mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_tqrcpbegin 2187/3626 Test #2425: mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_tqrcpend .....***Timeout 414.51 
sec ischedInit: The thread number has been automatically set to 256 Start 2425: mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_tqrcpend 2187/3626 Test #2427: mpi_dst_example_simple_lap_z_facto1_sched0_not_rqrrtend .................***Timeout 414.50 sec ischedInit: The thread number has been automatically set to 256 Start 2427: mpi_dst_example_simple_lap_z_facto1_sched0_not_rqrrtend 2187/3626 Test #2428: mpi_dst_example_simple_lap_z_facto1_sched0_kway_rqrrtbegin ..............***Timeout 414.50 sec ischedInit: The thread number has been automatically set to 256 Start 2428: mpi_dst_example_simple_lap_z_facto1_sched0_kway_rqrrtbegin 2187/3626 Test #2429: mpi_dst_example_simple_lap_z_facto1_sched0_kway_rqrrtend ................***Timeout 414.50 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 
Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2429: mpi_dst_example_simple_lap_z_facto1_sched0_kway_rqrrtend 2187/3626 Test #2430: mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_rqrrtbegin ...***Timeout 414.50 sec Start 2430: mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_rqrrtbegin 2187/3626 Test #2431: mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_rqrrtend .....***Timeout 414.50 sec Start 2431: mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_rqrrtend 2187/3626 Test #2432: mpi_dst_example_simple_lap_z_facto1_sched0_kway_pqrcpilu0 ...............***Timeout 414.50 sec Start 2432: mpi_dst_example_simple_lap_z_facto1_sched0_kway_pqrcpilu0 2187/3626 Test #2433: mpi_dst_example_simple_lap_z_facto1_sched0_kway_pqrcpilu1 ...............***Timeout 414.50 sec ischedInit: The thread number has been automatically set to 256 Start 2433: mpi_dst_example_simple_lap_z_facto1_sched0_kway_pqrcpilu1 2187/3626 Test #2434: mpi_dst_example_simple_lap_z_facto2_sched0_not_svdbegin .................***Timeout 414.50 sec Start 2434: mpi_dst_example_simple_lap_z_facto2_sched0_not_svdbegin 2187/3626 Test #2435: mpi_dst_example_simple_lap_z_facto2_sched0_not_svdend ...................***Timeout 414.50 sec Start 2435: mpi_dst_example_simple_lap_z_facto2_sched0_not_svdend 2187/3626 Test #2436: mpi_dst_example_simple_lap_z_facto2_sched0_kway_svdbegin ................***Timeout 414.50 sec ischedInit: The thread number has been automatically set to 256 Start 2436: mpi_dst_example_simple_lap_z_facto2_sched0_kway_svdbegin 2187/3626 Test #2438: mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_svdbegin .....***Timeout 414.49 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2438: mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_svdbegin 2187/3626 Test #2439: mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_svdend .......***Timeout 414.49 sec ischedInit: The thread number has been automatically set to 256 Start 2439: mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_svdend 2187/3626 Test #2440: mpi_dst_example_simple_lap_z_facto2_sched0_not_pqrcpbegin ...............***Timeout 414.49 sec ischedInit: The thread number has been automatically set to 256 Start 2440: mpi_dst_example_simple_lap_z_facto2_sched0_not_pqrcpbegin 2187/3626 Test #2441: mpi_dst_example_simple_lap_z_facto2_sched0_not_pqrcpend .................***Timeout 414.49 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2441: mpi_dst_example_simple_lap_z_facto2_sched0_not_pqrcpend 2187/3626 Test #2442: mpi_dst_example_simple_lap_z_facto2_sched0_kway_pqrcpbegin ..............***Timeout 414.48 sec 
Start 2442: mpi_dst_example_simple_lap_z_facto2_sched0_kway_pqrcpbegin 2187/3626 Test #2443: mpi_dst_example_simple_lap_z_facto2_sched0_kway_pqrcpend ................***Timeout 414.48 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2443: mpi_dst_example_simple_lap_z_facto2_sched0_kway_pqrcpend 2187/3626 Test #2444: mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_pqrcpbegin ...***Timeout 414.48 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low 
rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2444: mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_pqrcpbegin 2187/3626 Test #2445: mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_pqrcpend .....***Timeout 414.48 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2445: mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_pqrcpend 2187/3626 Test #2446: mpi_dst_example_simple_lap_z_facto2_sched0_not_rqrcpbegin ...............***Timeout 414.48 sec ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2446: mpi_dst_example_simple_lap_z_facto2_sched0_not_rqrcpbegin 2187/3626 Test #2447: mpi_dst_example_simple_lap_z_facto2_sched0_not_rqrcpend .................***Timeout 414.48 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2447: mpi_dst_example_simple_lap_z_facto2_sched0_not_rqrcpend 2187/3626 Test #2448: mpi_dst_example_simple_lap_z_facto2_sched0_kway_rqrcpbegin ..............***Timeout 414.48 sec Start 2448: mpi_dst_example_simple_lap_z_facto2_sched0_kway_rqrcpbegin 2187/3626 Test #2449: mpi_dst_example_simple_lap_z_facto2_sched0_kway_rqrcpend ................***Timeout 414.48 sec Start 2449: mpi_dst_example_simple_lap_z_facto2_sched0_kway_rqrcpend 2187/3626 Test #2450: mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_rqrcpbegin ...***Timeout 414.48 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 2450: mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_rqrcpbegin 2187/3626 Test #2451: mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_rqrcpend .....***Timeout 414.47 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2451: mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_rqrcpend 2187/3626 Test #2452: mpi_dst_example_simple_lap_z_facto2_sched0_not_tqrcpbegin ...............***Timeout 414.47 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models 
CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2452: mpi_dst_example_simple_lap_z_facto2_sched0_not_tqrcpbegin 2187/3626 Test #2453: mpi_dst_example_simple_lap_z_facto2_sched0_not_tqrcpend .................***Timeout 414.47 sec Start 2453: mpi_dst_example_simple_lap_z_facto2_sched0_not_tqrcpend 2187/3626 Test #2454: mpi_dst_example_simple_lap_z_facto2_sched0_kway_tqrcpbegin ..............***Timeout 414.47 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2454: mpi_dst_example_simple_lap_z_facto2_sched0_kway_tqrcpbegin 2187/3626 Test #2455: mpi_dst_example_simple_lap_z_facto2_sched0_kway_tqrcpend ................***Timeout 414.47 sec Start 2455: mpi_dst_example_simple_lap_z_facto2_sched0_kway_tqrcpend 2187/3626 Test #2456: mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_tqrcpbegin ...***Timeout 414.47 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 
1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2456: mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_tqrcpbegin 2187/3626 Test #2457: mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_tqrcpend .....***Timeout 414.47 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2457: mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_tqrcpend 2187/3626 Test #2458: mpi_dst_example_simple_lap_z_facto2_sched0_not_rqrrtbegin ...............***Timeout 414.47 sec ischedInit: The thread number has been automatically set to 256 Start 2458: mpi_dst_example_simple_lap_z_facto2_sched0_not_rqrrtbegin 2187/3626 Test #2459: mpi_dst_example_simple_lap_z_facto2_sched0_not_rqrrtend .................***Timeout 414.47 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2459: mpi_dst_example_simple_lap_z_facto2_sched0_not_rqrrtend 2187/3626 Test #2460: mpi_dst_example_simple_lap_z_facto2_sched0_kway_rqrrtbegin ..............***Timeout 414.47 sec Start 2460: mpi_dst_example_simple_lap_z_facto2_sched0_kway_rqrrtbegin 2187/3626 Test #2461: mpi_dst_example_simple_lap_z_facto2_sched0_kway_rqrrtend ................***Timeout 414.48 sec Start 2461: mpi_dst_example_simple_lap_z_facto2_sched0_kway_rqrrtend 2187/3626 Test #2462: mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_rqrrtbegin ...***Timeout 414.48 sec Start 2462: mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_rqrrtbegin 2187/3626 Test #2463: mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_rqrrtend .....***Timeout 414.48 sec 
ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2463: mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_rqrrtend 2187/3626 Test #2464: mpi_dst_example_simple_lap_z_facto2_sched0_kway_pqrcpilu0 ...............***Timeout 414.48 sec Start 2464: mpi_dst_example_simple_lap_z_facto2_sched0_kway_pqrcpilu0 2187/3626 Test #2465: mpi_dst_example_simple_lap_z_facto2_sched0_kway_pqrcpilu1 ...............***Timeout 414.48 sec ischedInit: The thread number has been automatically set to 256 Start 2465: mpi_dst_example_simple_lap_z_facto2_sched0_kway_pqrcpilu1 2187/3626 Test #2466: mpi_dst_example_simple_lap_z_facto3_sched0_not_svdbegin .................***Timeout 414.48 sec Start 2466: mpi_dst_example_simple_lap_z_facto3_sched0_not_svdbegin 2187/3626 Test #2467: mpi_dst_example_simple_lap_z_facto3_sched0_not_svdend ...................***Timeout 414.48 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been 
automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2467: mpi_dst_example_simple_lap_z_facto3_sched0_not_svdend 2187/3626 Test #2468: mpi_dst_example_simple_lap_z_facto3_sched0_kway_svdbegin ................***Timeout 414.48 sec ischedInit: The thread number has been automatically set to 256 Start 2468: mpi_dst_example_simple_lap_z_facto3_sched0_kway_svdbegin 2187/3626 Test #2469: mpi_dst_example_simple_lap_z_facto3_sched0_kway_svdend ..................***Timeout 414.48 sec ischedInit: The thread number has been automatically set to 256 Start 2469: mpi_dst_example_simple_lap_z_facto3_sched0_kway_svdend 2187/3626 Test #2470: mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_svdbegin .....***Timeout 414.48 sec Start 2470: mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_svdbegin 2187/3626 Test #2471: mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_svdend .......***Timeout 414.48 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2471: mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_svdend 2187/3626 Test #2472: mpi_dst_example_simple_lap_z_facto3_sched0_not_pqrcpbegin ...............***Timeout 414.48 sec Start 2472: mpi_dst_example_simple_lap_z_facto3_sched0_not_pqrcpbegin 2187/3626 Test #2473: mpi_dst_example_simple_lap_z_facto3_sched0_not_pqrcpend .................***Timeout 414.48 sec Start 2473: mpi_dst_example_simple_lap_z_facto3_sched0_not_pqrcpend 2187/3626 Test #2474: mpi_dst_example_simple_lap_z_facto3_sched0_kway_pqrcpbegin ..............***Timeout 414.48 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2474: mpi_dst_example_simple_lap_z_facto3_sched0_kway_pqrcpbegin 2187/3626 Test #2475: mpi_dst_example_simple_lap_z_facto3_sched0_kway_pqrcpend ................***Timeout 414.48 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread 
static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2475: mpi_dst_example_simple_lap_z_facto3_sched0_kway_pqrcpend 2187/3626 Test #2476: mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_pqrcpbegin ...***Timeout 414.48 sec Start 2476: mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_pqrcpbegin 2187/3626 Test #2477: mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_pqrcpend .....***Timeout 414.48 sec ischedInit: The thread number has been automatically set to 256 Start 2477: mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_pqrcpend Test #2356: mpi_dst_example_simple_lap_c_facto4_sched0_not_tqrcpbegin ...............***Timeout 414.47 sec ischedInit: The thread number has been automatically set to 256 Start 2356: mpi_dst_example_simple_lap_c_facto4_sched0_not_tqrcpbegin Test #2232: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_tqrcpbegin ...***Timeout 414.44 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per 
process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2232: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_tqrcpbegin Test #2242: mpi_dst_example_simple_lap_c_facto1_sched0_not_svdbegin .................***Timeout 414.44 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2242: mpi_dst_example_simple_lap_c_facto1_sched0_not_svdbegin Test #2244: mpi_dst_example_simple_lap_c_facto1_sched0_kway_svdbegin ................***Timeout 414.44 sec Start 2244: 
mpi_dst_example_simple_lap_c_facto1_sched0_kway_svdbegin Test #2262: mpi_dst_example_simple_lap_c_facto1_sched0_kway_tqrcpbegin ..............***Timeout 414.44 sec Start 2262: mpi_dst_example_simple_lap_c_facto1_sched0_kway_tqrcpbegin Test #2306: mpi_dst_example_simple_lap_c_facto3_sched0_not_svdbegin .................***Timeout 414.44 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2306: mpi_dst_example_simple_lap_c_facto3_sched0_not_svdbegin Test #2308: mpi_dst_example_simple_lap_c_facto3_sched0_kway_svdbegin ................***Timeout 414.44 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI 
processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 2308: mpi_dst_example_simple_lap_c_facto3_sched0_kway_svdbegin Test #2310: mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_svdbegin .....***Timeout 414.44 sec Start 2310: mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_svdbegin Test #2338: mpi_dst_example_simple_lap_c_facto4_sched0_not_svdbegin .................***Timeout 414.44 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2338: mpi_dst_example_simple_lap_c_facto4_sched0_not_svdbegin Test #2342: mpi_dst_example_simple_lap_c_facto4_sched0_kwayprojections_svdbegin .....***Timeout 414.44 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low 
rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 2342: mpi_dst_example_simple_lap_c_facto4_sched0_kwayprojections_svdbegin 2187/3626 Test #2478: mpi_dst_example_simple_lap_z_facto3_sched0_not_rqrcpbegin ...............***Timeout 414.43 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2478: mpi_dst_example_simple_lap_z_facto3_sched0_not_rqrcpbegin 2187/3626 Test #2479: mpi_dst_example_simple_lap_z_facto3_sched0_not_rqrcpend .................***Timeout 414.43 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2479: 
mpi_dst_example_simple_lap_z_facto3_sched0_not_rqrcpend 2187/3626 Test #2480: mpi_dst_example_simple_lap_z_facto3_sched0_kway_rqrcpbegin ..............***Timeout 414.43 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2480: mpi_dst_example_simple_lap_z_facto3_sched0_kway_rqrcpbegin 2187/3626 Test #2481: mpi_dst_example_simple_lap_z_facto3_sched0_kway_rqrcpend ................***Timeout 414.43 sec Start 2481: mpi_dst_example_simple_lap_z_facto3_sched0_kway_rqrcpend 2187/3626 Test #2482: mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_rqrcpbegin ...***Timeout 414.43 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2482: mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_rqrcpbegin 2187/3626 Test #2483: mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_rqrcpend .....***Timeout 414.43 sec Start 2483: mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_rqrcpend 2187/3626 Test #2484: mpi_dst_example_simple_lap_z_facto3_sched0_not_tqrcpbegin ...............***Timeout 414.43 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance 
criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2484: mpi_dst_example_simple_lap_z_facto3_sched0_not_tqrcpbegin 2187/3626 Test #2485: mpi_dst_example_simple_lap_z_facto3_sched0_not_tqrcpend .................***Timeout 414.43 sec ischedInit: The thread number has been automatically set to 256 Start 2485: mpi_dst_example_simple_lap_z_facto3_sched0_not_tqrcpend 2187/3626 Test #2486: mpi_dst_example_simple_lap_z_facto3_sched0_kway_tqrcpbegin ..............***Timeout 414.43 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2486: mpi_dst_example_simple_lap_z_facto3_sched0_kway_tqrcpbegin 2187/3626 Test #2487: mpi_dst_example_simple_lap_z_facto3_sched0_kway_tqrcpend ................***Timeout 414.44 sec ischedInit: The thread number has been automatically 
set to 256 Start 2487: mpi_dst_example_simple_lap_z_facto3_sched0_kway_tqrcpend 2187/3626 Test #2488: mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_tqrcpbegin ...***Timeout 414.43 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2488: mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_tqrcpbegin 2187/3626 Test #2489: mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_tqrcpend .....***Timeout 414.43 sec Start 2489: mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_tqrcpend 2187/3626 Test #2490: mpi_dst_example_simple_lap_z_facto3_sched0_not_rqrrtbegin ...............***Timeout 414.43 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 2490: mpi_dst_example_simple_lap_z_facto3_sched0_not_rqrrtbegin 2187/3626 Test #2491: mpi_dst_example_simple_lap_z_facto3_sched0_not_rqrrtend 
.................***Timeout 414.43 sec ischedInit: The thread number has been automatically set to 256 Start 2491: mpi_dst_example_simple_lap_z_facto3_sched0_not_rqrrtend 2187/3626 Test #2492: mpi_dst_example_simple_lap_z_facto3_sched0_kway_rqrrtbegin ..............***Timeout 414.43 sec ischedInit: The thread number has been automatically set to 256 Start 2492: mpi_dst_example_simple_lap_z_facto3_sched0_kway_rqrrtbegin 2187/3626 Test #2493: mpi_dst_example_simple_lap_z_facto3_sched0_kway_rqrrtend ................***Timeout 414.43 sec ischedInit: The thread number has been automatically set to 256 Start 2493: mpi_dst_example_simple_lap_z_facto3_sched0_kway_rqrrtend 2187/3626 Test #2494: mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_rqrrtbegin ...***Timeout 414.43 sec ischedInit: The thread number has been automatically set to 256 Start 2494: mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_rqrrtbegin 2187/3626 Test #2495: mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_rqrrtend .....***Timeout 414.43 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block 
Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2495: mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_rqrrtend 2187/3626 Test #2496: mpi_dst_example_simple_lap_z_facto3_sched0_kway_pqrcpilu0 ...............***Timeout 414.43 sec Start 2496: mpi_dst_example_simple_lap_z_facto3_sched0_kway_pqrcpilu0 2187/3626 Test #2497: mpi_dst_example_simple_lap_z_facto3_sched0_kway_pqrcpilu1 ...............***Timeout 414.43 sec Start 2497: mpi_dst_example_simple_lap_z_facto3_sched0_kway_pqrcpilu1 2187/3626 Test #2498: mpi_dst_example_simple_lap_z_facto4_sched0_not_svdbegin .................***Timeout 414.43 sec Start 2498: mpi_dst_example_simple_lap_z_facto4_sched0_not_svdbegin 2187/3626 Test #2500: mpi_dst_example_simple_lap_z_facto4_sched0_kway_svdbegin ................***Timeout 414.42 sec ischedInit: The thread number has been automatically set to 256 Start 2500: mpi_dst_example_simple_lap_z_facto4_sched0_kway_svdbegin 2187/3626 Test #2501: mpi_dst_example_simple_lap_z_facto4_sched0_kway_svdend ..................***Timeout 414.42 sec ischedInit: The thread number has been automatically set to 256 Start 2501: mpi_dst_example_simple_lap_z_facto4_sched0_kway_svdend 2187/3626 Test #2502: mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_svdbegin .....***Timeout 414.42 sec ischedInit: The thread number has been automatically set to 256 Start 2502: mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_svdbegin 2187/3626 Test #2503: mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_svdend .......***Timeout 414.42 sec ischedInit: The thread number has been automatically set to 256 Start 2503: mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_svdend 2187/3626 Test #2504: mpi_dst_example_simple_lap_z_facto4_sched0_not_pqrcpbegin ...............***Timeout 414.42 sec ischedInit: The thread number has 
been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2504: mpi_dst_example_simple_lap_z_facto4_sched0_not_pqrcpbegin 2187/3626 Test #2505: mpi_dst_example_simple_lap_z_facto4_sched0_not_pqrcpend .................***Timeout 414.42 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2505: mpi_dst_example_simple_lap_z_facto4_sched0_not_pqrcpend 2187/3626 Test #2506: mpi_dst_example_simple_lap_z_facto4_sched0_kway_pqrcpbegin ..............***Timeout 414.42 sec ischedInit: The thread number has been automatically set to 256 Start 2506: mpi_dst_example_simple_lap_z_facto4_sched0_kway_pqrcpbegin 2187/3626 Test #2507: mpi_dst_example_simple_lap_z_facto4_sched0_kway_pqrcpend ................***Timeout 414.42 sec ischedInit: The thread number has been automatically set to 256 Start 2507: mpi_dst_example_simple_lap_z_facto4_sched0_kway_pqrcpend 2187/3626 Test #2508: mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_pqrcpbegin ...***Timeout 414.41 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min 
ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2508: mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_pqrcpbegin 2187/3626 Test #2509: mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_pqrcpend .....***Timeout 414.41 sec Start 2509: mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_pqrcpend 2187/3626 Test #2510: mpi_dst_example_simple_lap_z_facto4_sched0_not_rqrcpbegin ...............***Timeout 414.40 sec ischedInit: The thread number has been automatically set to 256 Start 2510: mpi_dst_example_simple_lap_z_facto4_sched0_not_rqrcpbegin 2187/3626 Test #2511: mpi_dst_example_simple_lap_z_facto4_sched0_not_rqrcpend .................***Timeout 414.39 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2511: mpi_dst_example_simple_lap_z_facto4_sched0_not_rqrcpend 2187/3626 Test #2512: 
mpi_dst_example_simple_lap_z_facto4_sched0_kway_rqrcpbegin ..............***Timeout 414.38 sec Start 2512: mpi_dst_example_simple_lap_z_facto4_sched0_kway_rqrcpbegin 2187/3626 Test #2513: mpi_dst_example_simple_lap_z_facto4_sched0_kway_rqrcpend ................***Timeout 414.37 sec ischedInit: The thread number has been automatically set to 256 Start 2513: mpi_dst_example_simple_lap_z_facto4_sched0_kway_rqrcpend 2187/3626 Test #2514: mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_rqrcpbegin ...***Timeout 414.36 sec Start 2514: mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_rqrcpbegin 2187/3626 Test #2515: mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_rqrcpend .....***Timeout 414.36 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2515: mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_rqrcpend 2187/3626 Test #2516: mpi_dst_example_simple_lap_z_facto4_sched0_not_tqrcpbegin ...............***Timeout 337.80 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per 
block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.074970e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.318790e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.735647e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.535988e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.296479e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 1.670416e-01 s Time to initialize coeftab 7.529216e-01 s Time to factorize 2.068058e+01 s ( 1.03 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 7.718465e-03 s - iteration 
1 : total iteration time 0.00606 s error 1.5512e-14 Time for refinement 1.363542e-02 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.550838e-14 max(|| b_i - A x_i ||_1) 2.329037e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.876950e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.550838e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.550838e-14 max(|| b_i - A x_i ||_1) 2.329037e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.876950e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 2.329037e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.876950e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.550838e-14 max(|| b_i - A x_i ||_1) 2.329037e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.876950e-02 (SUCCESS) Start 2516: mpi_dst_example_simple_lap_z_facto4_sched0_not_tqrcpbegin 2187/3626 Test #2517: mpi_dst_example_simple_lap_z_facto4_sched0_not_tqrcpend .................***Timeout 337.67 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP 
Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.417611e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.239437e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.704495e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.290644e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.560930e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 4.992927e-01 s Time to initialize coeftab 4.569173e-01 s Time to factorize 1.141351e+01 s ( 1.87 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 
o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 4.698436e-01 s - iteration 1 : total iteration time 0.655 s error 3.419e-16 Time for refinement 1.220433e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.613349e-16 max(|| b_i - A x_i ||_1) 8.485980e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.141301e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.613349e-16 max(|| b_i - A x_i ||_1) 8.485980e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.141301e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.613349e-16 max(|| b_i - A x_i ||_1) 8.485980e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.141301e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.613349e-16 max(|| b_i - A x_i ||_1) 8.485980e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.141301e-03 (SUCCESS) Start 2517: mpi_dst_example_simple_lap_z_facto4_sched0_not_tqrcpend 2187/3626 Test #2518: mpi_dst_example_simple_lap_z_facto4_sched0_kway_tqrcpbegin ..............***Timeout 334.72 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of 
MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.171591e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.512785e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.232688e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 9.010146e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.300680e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 2.615879e-01 s Time to initialize coeftab 3.926782e-01 s Time to factorize 1.114717e+01 s ( 1.91 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 
Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 1.171586e-01 s - iteration 1 : total iteration time 0.2 s error 1.5512e-14 Time for refinement 4.642808e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.550836e-14 max(|| b_i - A x_i ||_1) 2.328995e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.876843e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.550836e-14 max(|| b_i - A x_i ||_1) 2.328995e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.876843e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.550836e-14 max(|| b_i - A x_i ||_1) 2.328995e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.876843e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.550836e-14 max(|| b_i - A x_i ||_1) 2.328995e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.876843e-02 (SUCCESS) Start 2518: mpi_dst_example_simple_lap_z_facto4_sched0_kway_tqrcpbegin 2187/3626 Test #2519: mpi_dst_example_simple_lap_z_facto4_sched0_kway_tqrcpend ................***Timeout 334.53 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: 
sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.971264e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.840173e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.086297e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.883918e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.167195e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 
5.959687e-01 s Time to initialize coeftab 2.494288e-01 s Time to factorize 7.481654e+00 s ( 2.85 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 4.808079e-01 s - iteration 1 : total iteration time 0.237 s error 3.419e-16 Time for refinement 5.679183e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.613376e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.613376e-16 max(|| b_i - A x_i ||_1) 8.486531e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.141439e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.613376e-16 max(|| b_i - A x_i ||_1) 8.486531e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.141439e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.613376e-16 max(|| b_i - A x_i ||_1) 8.486531e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.141439e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 8.486531e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.141439e-03 (SUCCESS) Start 2519: mpi_dst_example_simple_lap_z_facto4_sched0_kway_tqrcpend 2187/3626 Test #2520: mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_tqrcpbegin ...***Timeout 332.74 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse 
matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.104367e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.719351e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.805416e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.387364e+00 s +-------------------------------------------------+ 
Analyze task: Total time for analyze 7.274649e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 5.150297e-01 s Time to initialize coeftab 5.020575e-01 s Time to factorize 1.455489e+01 s ( 1.46 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 1.039245e-02 s - iteration 1 : total iteration time 0.00907 s error 1.5512e-14 Time for refinement 2.144052e-02 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.550836e-14 max(|| b_i - A x_i ||_1) 2.328995e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.876843e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.550836e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.550836e-14 max(|| b_i - A x_i ||_1) 2.328995e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.876843e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 2.328995e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.876843e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.550836e-14 max(|| b_i - A x_i ||_1) 2.328995e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.876843e-02 (SUCCESS) Start 2520: mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_tqrcpbegin 2187/3626 Test #2521: mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_tqrcpend .....***Timeout 331.29 sec 
ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 [arch-nspawn-3655178:1687563] *** Process received signal *** [arch-nspawn-3655178:1687563] Signal: Segmentation fault (11) [arch-nspawn-3655178:1687563] Signal code: Address not mapped (1) [arch-nspawn-3655178:1687563] Failing at address: 0x7f8c6ffa2860 [arch-nspawn-3655178:1687563] [ 0] linux-vdso.so.1(__vdso_rt_sigreturn+0x0) [0x7fd92daca6cc] [arch-nspawn-3655178:1687563] [ 1] /usr/lib/libopen-pal.so.80(mca_btl_sm_poll_handle_frag+0x18a) [0x7fd91ffa3a02] [arch-nspawn-3655178:1687563] [ 2] /usr/lib/libopen-pal.so.80(+0x74504) [0x7fd91ffa4504] [arch-nspawn-3655178:1687563] [ 3] /usr/lib/libopen-pal.so.80(opal_progress+0x30) [0x7fd91ff55a7a] [arch-nspawn-3655178:1687563] [ 4] /usr/lib/libopen-pal.so.80(ompi_sync_wait_mt+0xda) [0x7fd91ff82aa2] [arch-nspawn-3655178:1687563] [ 5] /usr/lib/libmpi.so.40(+0x7de1a) [0x7fd91fc7de1a] 
[arch-nspawn-3655178:1687563] [ 6] /usr/lib/libmpi.so.40(ompi_request_default_wait+0x1a) [0x7fd91fc8019c] [arch-nspawn-3655178:1687563] [ 7] /usr/lib/libmpi.so.40(ompi_coll_base_sendrecv_actual+0x98) [0x7fd91fcf03e8] [arch-nspawn-3655178:1687563] [ 8] /usr/lib/libmpi.so.40(ompi_coll_base_allreduce_intra_recursivedoubling+0x210) [0x7fd91fcf1a88] [arch-nspawn-3655178:1687563] [ 9] /usr/lib/libmpi.so.40(ompi_coll_base_allreduce_intra_ring+0x3fc) [0x7fd91fcf443c] [arch-nspawn-3655178:1687563] [10] /usr/lib/libmpi.so.40(ompi_coll_tuned_allreduce_intra_dec_fixed+0x40) [0x7fd91fd15152] [arch-nspawn-3655178:1687563] [11] /usr/lib/libmpi.so.40(MPI_Allreduce+0x294) [0x7fd91fc8e584] [arch-nspawn-3655178:1687563] [12] /build/pastix/src/build/spm/src/libspm.so.1(spmUpdateComputedFields+0x140) [0x7fd92cc97458] [arch-nspawn-3655178:1687563] [13] /build/pastix/src/build/spm/src/libspm.so.1(genLaplacian+0xaa) [0x7fd92cca021e] [arch-nspawn-3655178:1687563] [14] /build/pastix/src/build/spm/src/libspm.so.1(+0x409c8) [0x7fd92cca19c8] [arch-nspawn-3655178:1687563] [15] ./simple(+0xe2c) [0x555555556e2c] [arch-nspawn-3655178:1687563] [16] /usr/lib/libc.so.6(+0x27fae) [0x7fd9242a3fae] [arch-nspawn-3655178:1687563] [17] /usr/lib/libc.so.6(__libc_start_main+0x72) [0x7fd9242a40b8] [arch-nspawn-3655178:1687563] [18] ./simple(+0x1174) [0x555555557174] [arch-nspawn-3655178:1687563] *** End of error message *** -------------------------------------------------------------------------- prte noticed that process rank 0 with PID 1687563 on node arch-nspawn-3655178 exited on signal 11 (Segmentation fault). 
-------------------------------------------------------------------------- Start 2521: mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_tqrcpend 2187/3626 Test #2522: mpi_dst_example_simple_lap_z_facto4_sched0_not_rqrrtbegin ...............***Timeout 327.31 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.367385e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.077184e-01 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.413584e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.087822e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.468758e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 3.594153e-01 s Time to initialize coeftab 5.468002e-01 s Time to factorize 5.993973e+00 s ( 3.55 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 6.971776e-02 s - iteration 1 : total iteration time 0.107 s error 4.5455e-09 - iteration 2 : total iteration time 0.103 s error 5.4821e-10 - iteration 3 : total iteration time 0.0773 s error 3.7667e-13 Time for refinement 3.782085e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.766590e-13 max(|| b_i - A x_i ||_1) 2.259507e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.701503e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.766590e-13 max(|| b_i - A x_i ||_1) 2.259507e-14 max(|| 
b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.701503e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.766590e-13 max(|| b_i - A x_i ||_1) 2.259507e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.701503e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.766590e-13 max(|| b_i - A x_i ||_1) 2.259507e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.701503e-01 (SUCCESS) Start 2522: mpi_dst_example_simple_lap_z_facto4_sched0_not_rqrrtbegin 2187/3626 Test #2523: mpi_dst_example_simple_lap_z_facto4_sched0_not_rqrrtend .................***Timeout 322.49 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : 
Ordering method is: Scotch Time to compute ordering 6.079401e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.342940e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.021908e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.034177e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.175867e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 2.455726e-01 s Time to initialize coeftab 1.477685e-01 s Time to factorize 3.556009e+00 s ( 5.99 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 2.480074e-01 s - iteration 1 : total iteration time 0.299 s error 3.6647e-16 Time for refinement 6.971530e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.922723e-16 max(|| b_i - A x_i ||_1) 9.386057e-17 
max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.368421e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.922723e-16 max(|| b_i - A x_i ||_1) 9.386057e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.368421e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.922723e-16 max(|| b_i - A x_i ||_1) 9.386057e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.368421e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.922723e-16 max(|| b_i - A x_i ||_1) 9.386057e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.368421e-03 (SUCCESS) Start 2523: mpi_dst_example_simple_lap_z_facto4_sched0_not_rqrrtend 2187/3626 Test #2524: mpi_dst_example_simple_lap_z_facto4_sched0_kway_rqrrtbegin ..............***Timeout 322.00 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: 
Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.339471e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.291346e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.253279e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.560756e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.720745e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 9.040946e-01 s Time to initialize coeftab 7.052406e-01 s Time to factorize 1.486866e+01 s ( 1.43 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 8.922892e-03 s - iteration 1 : total iteration time 0.00855 s error 4.5455e-09 - iteration 2 : total iteration time 0.00522 s error 5.4821e-10 - iteration 3 : total iteration time 0.00532 s error 3.7667e-13 Time for refinement 2.831071e-02 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 
max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.766590e-13 max(|| b_i - A x_i ||_1) 2.259507e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.701503e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.766590e-13 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.766590e-13 max(|| b_i - A x_i ||_1) 2.259507e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.701503e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.766590e-13 max(|| b_i - A x_i ||_1) 2.259507e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.701503e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.259507e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.701503e-01 (SUCCESS) Start 2524: mpi_dst_example_simple_lap_z_facto4_sched0_kway_rqrrtbegin 2187/3626 Test #2525: mpi_dst_example_simple_lap_z_facto4_sched0_kway_rqrrtend ................***Timeout 321.96 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 
Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.103443e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.789103e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.418569e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.959532e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.310103e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 7.053253e-01 s Time to initialize coeftab 3.742307e-01 s Time to factorize 7.772893e+00 s ( 2.74 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 4.461069e-02 s - iteration 1 : total iteration time 0.0235 s error 3.6647e-16 
Time for refinement 7.880719e-02 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.920126e-16 max(|| b_i - A x_i ||_1) 9.370724e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.364551e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.920126e-16 max(|| b_i - A x_i ||_1) 9.370724e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.364551e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.920126e-16 max(|| b_i - A x_i ||_1) 9.370724e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.364551e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.920126e-16 max(|| b_i - A x_i ||_1) 9.370724e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.364551e-03 (SUCCESS) Start 2525: mpi_dst_example_simple_lap_z_facto4_sched0_kway_rqrrtend 2187/3626 Test #2526: mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_rqrrtbegin ...***Timeout 320.88 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress 
method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.282155e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.926832e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.028109e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.949094e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.491659e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 8.031060e-01 s Time to initialize coeftab 8.089740e-01 s Time to factorize 1.216151e+01 s ( 1.75 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o 
Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 5.358872e-03 s - iteration 1 : total iteration time 0.0039 s error 4.5455e-09 - iteration 2 : total iteration time 0.00335 s error 5.4821e-10 - iteration 3 : total iteration time 0.0032 s error 3.7667e-13 Time for refinement 1.599845e-02 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.766590e-13 max(|| b_i - A x_i ||_1) 2.259507e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.701503e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.766590e-13 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.766590e-13 max(|| b_i - A x_i ||_1) 2.259507e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.701503e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.766590e-13 max(|| b_i - A x_i ||_1) 2.259507e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.701503e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.259507e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.701503e-01 (SUCCESS) Start 2526: mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_rqrrtbegin 2187/3626 Test #2527: mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_rqrrtend .....***Timeout 320.04 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled 
PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.460114e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.628837e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.550524e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.258236e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.597228e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 4.414552e-01 s Time to initialize coeftab 
1.228684e-01 s Time to factorize 5.586580e+00 s ( 3.81 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 9.228892e-02 s - iteration 1 : total iteration time 0.0993 s error 3.6647e-16 Time for refinement 3.215696e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.920126e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.920126e-16 max(|| b_i - A x_i ||_1) 9.370724e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.364551e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.920126e-16 max(|| b_i - A x_i ||_1) 9.370724e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.364551e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.920126e-16 max(|| b_i - A x_i ||_1) 9.370724e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.364551e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 9.370724e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.364551e-03 (SUCCESS) Start 2527: mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_rqrrtend 2187/3626 Test #2528: mpi_dst_example_simple_lap_z_facto4_sched0_kway_pqrcpilu0 ...............***Timeout 319.89 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 
Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.623324e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.934514e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.619802e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.004113e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 
6.727449e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 4.545314e-01 s Time to initialize coeftab 1.788852e-01 s Time to factorize 5.486100e+00 s ( 3.88 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 3.579867e-01 s - iteration 1 : total iteration time 0.341 s error 6.1304e-15 Time for refinement 8.688160e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.131900e-15 max(|| b_i - A x_i ||_1) 8.968767e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.263124e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.131900e-15 max(|| b_i - A x_i ||_1) 8.968767e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.263124e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.131900e-15 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.131900e-15 max(|| b_i - A x_i ||_1) 8.968767e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.263124e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 8.968767e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.263124e-02 (SUCCESS) Start 2528: mpi_dst_example_simple_lap_z_facto4_sched0_kway_pqrcpilu0 2187/3626 Test #2529: mpi_dst_example_simple_lap_z_facto4_sched0_kway_pqrcpilu1 ...............***Timeout 313.15 sec ischedInit: The thread number has been automatically set to 
256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.322613e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.547638e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.541637e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL 
Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.764666e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.541036e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 6.851377e-01 s Time to initialize coeftab 1.199912e-01 s Time to factorize 1.208060e+01 s ( 1.76 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 8.507766e-03 s - iteration 1 : total iteration time 0.00606 s error 6.1304e-15 Time for refinement 1.320049e-02 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.132569e-15 max(|| b_i - A x_i ||_1) 8.970997e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.263687e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.132569e-15 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.132569e-15 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.132569e-15 max(|| b_i - A x_i ||_1) 8.970997e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.263687e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 8.970997e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.263687e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 8.970997e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.263687e-02 (SUCCESS) Start 2529: mpi_dst_example_simple_lap_z_facto4_sched0_kway_pqrcpilu1 2187/3626 
Test #2530: mpi_dst_example_simple_lap_s_facto0_sched1_not_svdbegin .................***Timeout 312.51 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.321050e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.600834e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.899479e-01 s +-------------------------------------------------+ 
Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.888653e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.570793e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 7.679079e-01 s Time to initialize coeftab 4.668890e-01 s Time to factorize 4.655996e+00 s ( 1.09 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 4.479962e-01 s Time for refinement 3.253929e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.073652e-07 max(|| b_i - A x_i ||_1) 9.548574e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.199865e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.073652e-07 max(|| b_i - A x_i ||_1) 9.548574e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.199865e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.073652e-07 max(|| b_i - A x_i ||_1) 9.548574e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.199865e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.073652e-07 max(|| b_i - A x_i ||_1) 9.548574e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * 
eps)) 1.199865e+00 (SUCCESS) Start 2530: mpi_dst_example_simple_lap_s_facto0_sched1_not_svdbegin 2187/3626 Test #2531: mpi_dst_example_simple_lap_s_facto0_sched1_not_svdend ...................***Timeout 310.22 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.514096e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.404220e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping 
criterion -1 Time for reordering 1.455572e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.471096e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.823152e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.299253e-01 s Time to initialize coeftab 7.007207e-01 s Time to factorize 1.346456e+00 s ( 3.76 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.079793e+00 s Time for refinement 1.687516e+00 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.905192e-07 max(|| b_i - A x_i ||_1) 8.573656e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.077357e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.905192e-07 max(|| b_i - A x_i ||_1) 8.573656e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.077357e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.905192e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.905192e-07 max(|| b_i - A x_i ||_1) 8.573656e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.077357e+00 
(SUCCESS) max(|| b_i - A x_i ||_1) 8.573656e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.077357e+00 (SUCCESS) Start 2531: mpi_dst_example_simple_lap_s_facto0_sched1_not_svdend 2187/3626 Test #2532: mpi_dst_example_simple_lap_s_facto0_sched1_kway_svdbegin ................***Timeout 309.37 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.751620e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.000647e-02 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.919797e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 7.151951e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.857364e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.174367e-01 s Time to initialize coeftab 6.791476e-01 s Time to factorize 5.838535e+00 s (887.86 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 4.814576e-01 s Time for refinement 7.226498e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.205373e-07 max(|| b_i - A x_i ||_1) 9.878543e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.241328e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.205373e-07 max(|| b_i - A x_i ||_1) 9.878543e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.241328e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.205373e-07 max(|| b_i - A x_i ||_1) 9.878543e-08 max(|| b_i - A x_i ||_1 / 
(||A||_1 * ||x_i||_oo * eps)) 1.241328e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.205373e-07 max(|| b_i - A x_i ||_1) 9.878543e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.241328e+00 (SUCCESS) Start 2532: mpi_dst_example_simple_lap_s_facto0_sched1_kway_svdbegin 2187/3626 Test #2533: mpi_dst_example_simple_lap_s_facto0_sched1_kway_svdend ..................***Timeout 307.89 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.560306e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of 
nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.618875e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.616877e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.568395e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.745194e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.745948e-01 s Time to initialize coeftab 2.275530e-01 s Time to factorize 5.791613e+00 s (895.05 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 5.390039e-01 s Time for refinement 6.100014e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.928879e-07 max(|| b_i - A x_i ||_1) 8.642015e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.085947e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.928879e-07 max(|| b_i - A x_i ||_1) 8.642015e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.085947e+00 (SUCCESS) max(|| b_i - A 
x_i ||_2 / || b_i ||_2) 1.928879e-07 max(|| b_i - A x_i ||_1) 8.642015e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.085947e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.928879e-07 max(|| b_i - A x_i ||_1) 8.642015e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.085947e+00 (SUCCESS) Start 2533: mpi_dst_example_simple_lap_s_facto0_sched1_kway_svdend 2187/3626 Test #2534: mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_svdbegin .....***Timeout 306.10 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.458857e+01 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.110038e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.280356e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.267008e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.639606e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.864705e-01 s Time to initialize coeftab 3.784385e-01 s Time to factorize 6.461788e+00 s (802.22 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.1 Ko / 44.3 Ko ------------------------------------------------ Total 68.3 Ko / 68.5 Ko Time to solve 2.610032e-01 s Time for refinement 3.400277e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.073203e-07 max(|| b_i - A x_i ||_1) 9.430597e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.185040e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.073203e-07 
max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.073203e-07 max(|| b_i - A x_i ||_1) 9.430597e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.185040e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 9.430597e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.185040e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.073203e-07 max(|| b_i - A x_i ||_1) 9.430597e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.185040e+00 (SUCCESS) Start 2534: mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_svdbegin 2187/3626 Test #2535: mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_svdend .......***Timeout 305.60 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 
+-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.517751e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.040654e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.596126e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.861506e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.732455e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 5.133182e-01 s Time to initialize coeftab 1.429158e-01 s Time to factorize 3.038908e+00 s ( 1.67 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 5.580021e-01 s Time for refinement 7.659987e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.869155e-07 max(|| b_i - A x_i ||_1) 
8.504637e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.068685e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.869155e-07 max(|| b_i - A x_i ||_1) 8.504637e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.068685e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.869155e-07 max(|| b_i - A x_i ||_1) 8.504637e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.068685e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.869155e-07 max(|| b_i - A x_i ||_1) 8.504637e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.068685e+00 (SUCCESS) Start 2535: mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_svdend 2187/3626 Test #2536: mpi_dst_example_simple_lap_s_facto0_sched1_not_pqrcpbegin ...............***Timeout 304.53 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: 
Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.365034e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.004853e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.065955e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.320487e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.559462e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 5.658060e-01 s Time to initialize coeftab 2.264385e-01 s Time to factorize 5.039164e+00 s ( 1.00 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 5.151434e-01 s - iteration 1 : total iteration time 0.856 s error 3.7873e-11 Time for refinement 1.591457e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 
2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.216768e-08 max(|| b_i - A x_i ||_1) 3.024978e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.801158e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.216768e-08 max(|| b_i - A x_i ||_1) 3.024978e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.801158e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.216768e-08 max(|| b_i - A x_i ||_1) 3.024978e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.801158e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.216768e-08 max(|| b_i - A x_i ||_1) 3.024978e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.801158e-01 (SUCCESS) Start 2536: mpi_dst_example_simple_lap_s_facto0_sched1_not_pqrcpbegin 2187/3626 Test #2537: mpi_dst_example_simple_lap_s_facto0_sched1_not_pqrcpend .................***Timeout 303.00 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute 
Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.748278e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.757907e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.389570e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.036878e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.954005e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.173529e-01 s Time to initialize coeftab 4.358650e-01 s Time to factorize 6.592091e-01 s ( 7.68 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.122615e+00 s Time for refinement 1.328350e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i 
||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.690159e-07 max(|| b_i - A x_i ||_1) 1.185573e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.489779e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.690159e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.690159e-07 max(|| b_i - A x_i ||_1) 1.185573e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.489779e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.185573e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.489779e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.690159e-07 max(|| b_i - A x_i ||_1) 1.185573e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.489779e+00 (SUCCESS) Start 2537: mpi_dst_example_simple_lap_s_facto0_sched1_not_pqrcpend 2187/3626 Test #2538: mpi_dst_example_simple_lap_s_facto0_sched1_kway_pqrcpbegin ..............***Timeout 300.22 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method 
CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.717516e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.138945e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.249781e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 8.059407e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.835377e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.050001e-01 s Time to initialize coeftab 3.120079e-01 s Time to factorize 3.451008e+00 s ( 1.47 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 5.370007e-01 s - iteration 1 : total iteration time 0.828 s error 
3.3268e-11 Time for refinement 1.915007e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.090352e-08 max(|| b_i - A x_i ||_1) 2.986382e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.752659e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.090352e-08 max(|| b_i - A x_i ||_1) 2.986382e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.752659e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.090352e-08 max(|| b_i - A x_i ||_1) 2.986382e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.752659e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.090352e-08 max(|| b_i - A x_i ||_1) 2.986382e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.752659e-01 (SUCCESS) Start 2538: mpi_dst_example_simple_lap_s_facto0_sched1_kway_pqrcpbegin 2187/3626 Test #2539: mpi_dst_example_simple_lap_s_facto0_sched1_kway_pqrcpend ................***Timeout 300.15 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 
Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.475161e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.343719e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.685158e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 8.922332e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.603513e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 5.121692e-01 s Time to initialize coeftab 2.269069e-01 s Time to factorize 2.097516e+00 s ( 2.41 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 
44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 5.432091e-01 s Time for refinement 7.150840e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.734489e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.734489e-07 max(|| b_i - A x_i ||_1) 1.281893e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.610814e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.734489e-07 max(|| b_i - A x_i ||_1) 1.281893e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.610814e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.281893e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.610814e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.734489e-07 max(|| b_i - A x_i ||_1) 1.281893e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.610814e+00 (SUCCESS) Start 2539: mpi_dst_example_simple_lap_s_facto0_sched1_kway_pqrcpend 2187/3626 Test #2540: mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_pqrcpbegin ...***Timeout 295.08 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: 
PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.616373e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.011465e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.586297e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.173431e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.808818e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.120650e-01 s Time to initialize coeftab 2.600868e-01 s Time to factorize 1.854967e+00 s ( 2.73 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o 
Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 8.861302e-01 s - iteration 1 : total iteration time 1.62 s error 3.5522e-11 Time for refinement 3.091104e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.017353e-08 max(|| b_i - A x_i ||_1) 2.984747e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.750605e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.017353e-08 max(|| b_i - A x_i ||_1) 2.984747e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.750605e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.017353e-08 max(|| b_i - A x_i ||_1) 2.984747e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.750605e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.017353e-08 max(|| b_i - A x_i ||_1) 2.984747e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.750605e-01 (SUCCESS) Start 2540: mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_pqrcpbegin 2187/3626 Test #2541: mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_pqrcpend .....***Timeout 289.15 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads 
per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.145718e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.263611e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.160190e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 9.282955e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.291995e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.546566e-01 s Time to initialize coeftab 1.917320e-01 s Time to factorize 
1.894850e+00 s ( 2.67 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 6.896319e-01 s Time for refinement 8.617787e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.756710e-07 max(|| b_i - A x_i ||_1) 1.286426e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.616510e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.756710e-07 max(|| b_i - A x_i ||_1) 1.286426e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.616510e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.756710e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.756710e-07 max(|| b_i - A x_i ||_1) 1.286426e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.616510e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.286426e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.616510e+00 (SUCCESS) Start 2541: mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_pqrcpend 2187/3626 Test #2542: mpi_dst_example_simple_lap_s_facto0_sched1_not_rqrcpbegin ...............***Timeout 288.49 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread 
static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.431405e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.446133e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.884356e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.204990e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.612454e+01 s +-------------------------------------------------+ Factorization task: Factorization used: 
LL^h Time to initialize internal csc 4.915917e-01 s Time to initialize coeftab 4.880549e-01 s Time to factorize 5.280959e+00 s (981.60 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.2 Ko / 44.3 Ko ------------------------------------------------ Total 68.4 Ko / 68.5 Ko Time to solve 4.989340e-01 s - iteration 1 : total iteration time 0.413 s error 4.2792e-11 Time for refinement 1.332602e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.033460e-08 max(|| b_i - A x_i ||_1) 2.971176e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.733551e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.033460e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.033460e-08 max(|| b_i - A x_i ||_1) 2.971176e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.733551e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.033460e-08 max(|| b_i - A x_i ||_1) 2.971176e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.733551e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.971176e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.733551e-01 (SUCCESS) Start 2542: mpi_dst_example_simple_lap_s_facto0_sched1_not_rqrcpbegin 2187/3626 Test #2543: mpi_dst_example_simple_lap_s_facto0_sched1_not_rqrcpend .................***Timeout 287.30 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.618435e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.860792e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.139912e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 8.114030e-01 
s +-------------------------------------------------+ Analyze task: Total time for analyze 7.741800e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.520347e-01 s Time to initialize coeftab 1.850831e-01 s Time to factorize 1.200992e+00 s ( 4.22 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.008040e+00 s Time for refinement 6.053443e-01 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.067039e-07 max(|| b_i - A x_i ||_1) 1.267320e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.592502e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.067039e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.067039e-07 max(|| b_i - A x_i ||_1) 1.267320e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.592502e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.267320e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.592502e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.067039e-07 max(|| b_i - A x_i ||_1) 1.267320e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.592502e+00 (SUCCESS) Start 2543: mpi_dst_example_simple_lap_s_facto0_sched1_not_rqrcpend 2187/3626 Test #2544: mpi_dst_example_simple_lap_s_facto0_sched1_kway_rqrcpbegin ..............***Timeout 283.23 sec ischedInit: The thread number 
has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.814542e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.159073e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.242903e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops 
Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 9.209328e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.999365e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 8.479378e-02 s Time to initialize coeftab 6.279861e-01 s Time to factorize 2.638968e+00 s ( 1.92 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.2 Ko / 44.3 Ko ------------------------------------------------ Total 68.4 Ko / 68.5 Ko Time to solve 7.510034e-01 s - iteration 1 : total iteration time 1.95 s error 4.2223e-11 Time for refinement 3.373260e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.013747e-08 max(|| b_i - A x_i ||_1) 2.974759e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.738054e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.013747e-08 max(|| b_i - A x_i ||_1) 2.974759e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.738054e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.013747e-08 max(|| b_i - A x_i ||_1) 2.974759e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.738054e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.013747e-08 max(|| b_i - A x_i ||_1) 2.974759e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.738054e-01 (SUCCESS) Start 2544: 
mpi_dst_example_simple_lap_s_facto0_sched1_kway_rqrcpbegin 2187/3626 Test #2545: mpi_dst_example_simple_lap_s_facto0_sched1_kway_rqrcpend ................***Timeout 278.80 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.532548e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.967593e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 
2.451522e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 6.955159e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.636110e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.798963e-01 s Time to initialize coeftab 1.153020e-01 s Time to factorize 1.779551e+00 s ( 2.84 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 7.000209e-01 s Time for refinement 6.221788e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.327485e-07 max(|| b_i - A x_i ||_1) 9.721398e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.221581e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.327485e-07 max(|| b_i - A x_i ||_1) 9.721398e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.221581e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.327485e-07 max(|| b_i - A x_i ||_1) 9.721398e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.221581e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.327485e-07 max(|| b_i - A x_i 
||_1) 9.721398e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.221581e+00 (SUCCESS) Start 2545: mpi_dst_example_simple_lap_s_facto0_sched1_kway_rqrcpend 2187/3626 Test #2546: mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_rqrcpbegin ...***Timeout 272.71 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.493571e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.170056e-02 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.580836e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.863783e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.734039e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.341317e-01 s Time to initialize coeftab 4.781165e-01 s Time to factorize 3.061281e+00 s ( 1.65 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.2 Ko / 44.3 Ko ------------------------------------------------ Total 68.4 Ko / 68.5 Ko Time to solve 5.519686e-01 s - iteration 1 : total iteration time 1.04 s error 4.2656e-11 Time for refinement 2.544622e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.244860e-08 max(|| b_i - A x_i ||_1) 3.090109e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.883002e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.244860e-08 max(|| b_i - A x_i ||_1) 3.090109e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.883002e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.244860e-08 
max(|| b_i - A x_i ||_1) 3.090109e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.883002e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.244860e-08 max(|| b_i - A x_i ||_1) 3.090109e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.883002e-01 (SUCCESS) Start 2546: mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_rqrcpbegin 2187/3626 Test #2547: mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_rqrcpend .....***Timeout 270.00 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.320610e+01 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.990409e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.194579e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.093044e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.555778e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 5.790544e-01 s Time to initialize coeftab 9.980766e-02 s Time to factorize 1.477876e+00 s ( 3.43 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 6.049970e-01 s Time for refinement 8.998154e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.342550e-07 max(|| b_i - A x_i ||_1) 1.408206e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.769539e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.342550e-07 
max(|| b_i - A x_i ||_1) 1.408206e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.769539e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.342550e-07 max(|| b_i - A x_i ||_1) 1.408206e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.769539e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.342550e-07 max(|| b_i - A x_i ||_1) 1.408206e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.769539e+00 (SUCCESS) Start 2547: mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_rqrcpend 2187/3626 Test #2548: mpi_dst_example_simple_lap_s_facto0_sched1_not_tqrcpbegin ...............***Timeout 253.35 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 
+-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.416518e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.941891e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.453145e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.269922e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.662671e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.359155e-01 s Time to initialize coeftab 4.405813e-01 s Time to factorize 3.649076e+00 s ( 1.39 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.2 Ko / 44.3 Ko ------------------------------------------------ Total 68.4 Ko / 68.5 Ko Time to solve 2.961757e-01 s - iteration 1 : total iteration time 0.907 s error 4.4749e-11 Time for refinement 1.756884e+00 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 
/ || b_i ||_2) 6.018805e-08 max(|| b_i - A x_i ||_1) 2.957321e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.716141e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.018805e-08 max(|| b_i - A x_i ||_1) 2.957321e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.716141e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.018805e-08 max(|| b_i - A x_i ||_1) 2.957321e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.716141e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.018805e-08 max(|| b_i - A x_i ||_1) 2.957321e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.716141e-01 (SUCCESS) Start 2548: mpi_dst_example_simple_lap_s_facto0_sched1_not_tqrcpbegin 2187/3626 Test #2549: mpi_dst_example_simple_lap_s_facto0_sched1_not_tqrcpend .................***Timeout 253.17 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has 
been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.380670e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.993376e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.638540e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.221987e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.622328e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.937388e-01 s Time to initialize coeftab 1.364618e-01 s Time to factorize 1.955446e+00 s ( 2.59 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 7.793374e-01 s Time for refinement 8.130717e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A 
||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.091306e-07 max(|| b_i - A x_i ||_1) 1.301129e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.634986e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.091306e-07 max(|| b_i - A x_i ||_1) 1.301129e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.634986e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.091306e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.091306e-07 max(|| b_i - A x_i ||_1) 1.301129e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.634986e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.301129e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.634986e+00 (SUCCESS) Start 2549: mpi_dst_example_simple_lap_s_facto0_sched1_not_tqrcpend 2187/3626 Test #2550: mpi_dst_example_simple_lap_s_facto0_sched1_kway_tqrcpbegin ..............***Timeout 252.20 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS 
Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.451493e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.777952e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.388602e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.033385e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.621216e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.240358e-01 s Time to initialize coeftab 5.038537e-01 s Time to factorize 3.824914e+00 s ( 1.32 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44 Ko / 44.3 Ko ------------------------------------------------ Total 68.2 Ko / 68.5 Ko Time to solve 4.726123e-01 s - iteration 1 : total iteration time 0.687 s error 2.7481e-11 Time for refinement 1.891656e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 
5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.116704e-08 max(|| b_i - A x_i ||_1) 3.010090e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.782450e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.116704e-08 max(|| b_i - A x_i ||_1) 3.010090e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.782450e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.116704e-08 max(|| b_i - A x_i ||_1) 3.010090e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.782450e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.116704e-08 max(|| b_i - A x_i ||_1) 3.010090e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.782450e-01 (SUCCESS) Start 2550: mpi_dst_example_simple_lap_s_facto0_sched1_kway_tqrcpbegin 2187/3626 Test #2551: mpi_dst_example_simple_lap_s_facto0_sched1_kway_tqrcpend ................***Timeout 251.82 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute 
Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.668059e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.217195e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.483748e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.432716e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.971901e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.604263e-01 s Time to initialize coeftab 5.894413e-01 s Time to factorize 1.720881e+00 s ( 2.94 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 5.592207e-01 s Time for refinement 1.082307e+00 s 
|| A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.332192e-07 max(|| b_i - A x_i ||_1) 9.837855e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.236215e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.332192e-07 max(|| b_i - A x_i ||_1) 9.837855e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.236215e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.332192e-07 max(|| b_i - A x_i ||_1) 9.837855e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.236215e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.332192e-07 max(|| b_i - A x_i ||_1) 9.837855e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.236215e+00 (SUCCESS) Start 2551: mpi_dst_example_simple_lap_s_facto0_sched1_kway_tqrcpend 2187/3626 Test #2552: mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_tqrcpbegin ...***Timeout 251.51 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 
16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.273319e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.216709e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.172492e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.136881e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.447170e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 7.390250e-01 s Time to initialize coeftab 4.968734e-01 s Time to factorize 5.673509e+00 s (913.68 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44 Ko / 44.3 Ko 
------------------------------------------------ Total 68.2 Ko / 68.5 Ko Time to solve 4.137885e-01 s - iteration 1 : total iteration time 0.351 s error 4.0569e-11 Time for refinement 1.224966e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.903407e-08 max(|| b_i - A x_i ||_1) 2.941828e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.696673e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.903407e-08 max(|| b_i - A x_i ||_1) 2.941828e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.696673e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.903407e-08 max(|| b_i - A x_i ||_1) 2.941828e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.696673e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.903407e-08 max(|| b_i - A x_i ||_1) 2.941828e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.696673e-01 (SUCCESS) Start 2552: mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_tqrcpbegin 2187/3626 Test #2553: mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_tqrcpend .....***Timeout 251.34 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per 
process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.107322e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.931616e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.992445e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.879125e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.329286e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.189083e-01 s Time to initialize coeftab 1.179642e-01 s Time to factorize 3.027998e+00 s ( 1.67 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: 
------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 5.459999e-01 s Time for refinement 6.460331e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.645165e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.645165e-07 max(|| b_i - A x_i ||_1) 1.144272e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.437881e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.645165e-07 max(|| b_i - A x_i ||_1) 1.144272e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.437881e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.144272e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.437881e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.645165e-07 max(|| b_i - A x_i ||_1) 1.144272e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.437881e+00 (SUCCESS) Start 2553: mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_tqrcpend 2187/3626 Test #2554: mpi_dst_example_simple_lap_s_facto0_sched1_not_rqrrtbegin ...............***Timeout 249.02 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI 
communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.098463e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.707696e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.455957e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.174649e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.341025e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.052798e-01 s Time to initialize coeftab 3.910709e-01 s 
Time to factorize 2.408988e+00 s ( 2.10 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 7.414103e-01 s - iteration 1 : total iteration time 1.09 s error 4.1557e-11 Time for refinement 2.165989e+00 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.111538e-08 max(|| b_i - A x_i ||_1) 3.007764e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.779528e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.111538e-08 max(|| b_i - A x_i ||_1) 3.007764e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.779528e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.111538e-08 max(|| b_i - A x_i ||_1) 3.007764e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.779528e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.111538e-08 max(|| b_i - A x_i ||_1) 3.007764e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.779528e-01 (SUCCESS) Start 2554: mpi_dst_example_simple_lap_s_facto0_sched1_not_rqrrtbegin 2187/3626 Test #2555: mpi_dst_example_simple_lap_s_facto0_sched1_not_rqrrtend .................***Timeout 248.62 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled 
thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.604015e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.257961e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.956161e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.209899e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.814478e+01 s 
+-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.365043e-01 s Time to initialize coeftab 3.163844e-01 s Time to factorize 1.488629e+00 s ( 3.40 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 7.150723e-01 s Time for refinement 7.790695e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.682586e-07 max(|| b_i - A x_i ||_1) 1.177933e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.480180e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.682586e-07 max(|| b_i - A x_i ||_1) 1.177933e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.480180e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.682586e-07 max(|| b_i - A x_i ||_1) 1.177933e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.480180e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.682586e-07 max(|| b_i - A x_i ||_1) 1.177933e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.480180e+00 (SUCCESS) Start 2555: mpi_dst_example_simple_lap_s_facto0_sched1_not_rqrrtend 2187/3626 Test #2556: mpi_dst_example_simple_lap_s_facto0_sched1_kway_rqrrtbegin ..............***Timeout 244.53 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.363632e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.559199e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.176851e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.055216e+00 s 
+-------------------------------------------------+ Analyze task: Total time for analyze 6.542023e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.792801e-01 s Time to initialize coeftab 2.452783e-01 s Time to factorize 2.771910e+00 s ( 1.83 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 4.918893e-01 s - iteration 1 : total iteration time 0.876 s error 3.4514e-11 Time for refinement 1.927642e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.013839e-08 max(|| b_i - A x_i ||_1) 2.981040e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.745947e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.013839e-08 max(|| b_i - A x_i ||_1) 2.981040e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.745947e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.013839e-08 max(|| b_i - A x_i ||_1) 2.981040e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.745947e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.013839e-08 max(|| b_i - A x_i ||_1) 2.981040e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.745947e-01 (SUCCESS) Start 2556: mpi_dst_example_simple_lap_s_facto0_sched1_kway_rqrrtbegin 2187/3626 Test #2557: mpi_dst_example_simple_lap_s_facto0_sched1_kway_rqrrtend 
................***Timeout 244.35 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.415874e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.260039e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.903421e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 
17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.409042e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.639899e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.071547e-01 s Time to initialize coeftab 1.165414e-01 s Time to factorize 1.599479e+00 s ( 3.16 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 9.132994e-01 s Time for refinement 9.050505e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.746154e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.746154e-07 max(|| b_i - A x_i ||_1) 1.218270e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.530867e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.218270e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.530867e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.746154e-07 max(|| b_i - A x_i ||_1) 1.218270e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.530867e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.746154e-07 max(|| b_i - A x_i ||_1) 1.218270e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.530867e+00 (SUCCESS) Start 2557: 
mpi_dst_example_simple_lap_s_facto0_sched1_kway_rqrrtend 2187/3626 Test #2558: mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_rqrrtbegin ...***Timeout 241.70 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.201351e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.929382e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for 
reordering 2.512634e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.880236e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.447245e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 5.729697e-01 s Time to initialize coeftab 4.260676e-01 s Time to factorize 3.992004e+00 s ( 1.27 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 3.287391e-01 s - iteration 1 : total iteration time 0.413 s error 3.792e-11 Time for refinement 1.156950e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.977856e-08 max(|| b_i - A x_i ||_1) 2.995482e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.764094e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.977856e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.977856e-08 max(|| b_i - A x_i ||_1) 2.995482e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.764094e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.995482e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.764094e-01 (SUCCESS) 
max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.977856e-08 max(|| b_i - A x_i ||_1) 2.995482e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.764094e-01 (SUCCESS) Start 2558: mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_rqrrtbegin 2187/3626 Test #2559: mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_rqrrtend .....***Timeout 241.37 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.082984e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 
65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.142108e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.317087e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 8.580775e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.217769e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.309788e-01 s Time to initialize coeftab 3.373348e-01 s Time to factorize 2.732505e+00 s ( 1.85 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 5.143050e-01 s Time for refinement 8.611498e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.100134e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.100134e-07 max(|| b_i - A x_i ||_1) 9.022123e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.133711e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.100134e-07 max(|| b_i - A x_i ||_1) 9.022123e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.133711e+00 
(SUCCESS) max(|| b_i - A x_i ||_1) 9.022123e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.133711e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.100134e-07 max(|| b_i - A x_i ||_1) 9.022123e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.133711e+00 (SUCCESS) Start 2559: mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_rqrrtend 2187/3626 Test #2560: mpi_dst_example_simple_lap_s_facto0_sched1_kway_pqrcpilu0 ...............***Timeout 240.89 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.411221e+01 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.661914e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.518429e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.318444e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.607725e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.612463e-01 s Time to initialize coeftab 4.970740e-01 s Time to factorize 2.453138e+00 s ( 2.06 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 7.354014e-01 s - iteration 1 : total iteration time 1.34 s error 2.7902e-11 Time for refinement 3.046294e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.025074e-08 max(|| b_i - A x_i ||_1) 2.945604e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.701417e-01 
(SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.025074e-08 max(|| b_i - A x_i ||_1) 2.945604e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.701417e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.025074e-08 max(|| b_i - A x_i ||_1) 2.945604e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.701417e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.025074e-08 max(|| b_i - A x_i ||_1) 2.945604e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.701417e-01 (SUCCESS) Start 2560: mpi_dst_example_simple_lap_s_facto0_sched1_kway_pqrcpilu0 2187/3626 Test #2561: mpi_dst_example_simple_lap_s_facto0_sched1_kway_pqrcpilu1 ...............***Timeout 240.14 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 
1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.572925e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.855953e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.567649e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.681664e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.913945e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.590274e-01 s Time to initialize coeftab 7.480192e-01 s Time to factorize 1.721743e+00 s ( 2.94 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 9.093199e-01 s - iteration 1 : total iteration time 1.79 s error 3.523e-11 Time for refinement 3.434755e+00 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 
4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.064848e-08 max(|| b_i - A x_i ||_1) 2.987991e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.754681e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.064848e-08 max(|| b_i - A x_i ||_1) 2.987991e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.754681e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.064848e-08 max(|| b_i - A x_i ||_1) 2.987991e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.754681e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.064848e-08 max(|| b_i - A x_i ||_1) 2.987991e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.754681e-01 (SUCCESS) Start 2561: mpi_dst_example_simple_lap_s_facto0_sched1_kway_pqrcpilu1 2187/3626 Test #2562: mpi_dst_example_simple_lap_s_facto1_sched1_not_svdbegin .................***Timeout 237.10 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 
2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.470155e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.905957e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.698780e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.607238e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.603424e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.535427e-01 s Time to initialize coeftab 4.238273e-01 s Time to factorize 6.263830e+00 s (855.56 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 6.563182e-01 s Time for refinement 6.316352e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 
2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.957907e-07 max(|| b_i - A x_i ||_1) 8.643843e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.086177e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.957907e-07 max(|| b_i - A x_i ||_1) 8.643843e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.086177e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.957907e-07 max(|| b_i - A x_i ||_1) 8.643843e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.086177e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.957907e-07 max(|| b_i - A x_i ||_1) 8.643843e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.086177e+00 (SUCCESS) Start 2562: mpi_dst_example_simple_lap_s_facto1_sched1_not_svdbegin 2187/3626 Test #2563: mpi_dst_example_simple_lap_s_facto1_sched1_not_svdend ...................***Timeout 235.74 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block 
Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.935916e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.169503e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.380733e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.353584e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.094830e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.841141e-01 s Time to initialize coeftab 5.489990e-01 s Time to factorize 2.316755e+00 s ( 2.26 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 6.385100e-01 s Time for refinement 5.911501e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 
max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.716120e-07 max(|| b_i - A x_i ||_1) 7.564191e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.505090e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.716120e-07 max(|| b_i - A x_i ||_1) 7.564191e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.505090e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.716120e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.716120e-07 max(|| b_i - A x_i ||_1) 7.564191e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.505090e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 7.564191e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.505090e-01 (SUCCESS) Start 2563: mpi_dst_example_simple_lap_s_facto1_sched1_not_svdend 2187/3626 Test #2564: mpi_dst_example_simple_lap_s_facto1_sched1_kway_svdbegin ................***Timeout 234.59 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress 
minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.011466e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.939240e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.980460e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.226554e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.197250e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 5.683546e-01 s Time to initialize coeftab 5.888170e-01 s Time to factorize 5.336000e+00 s (1004.32 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 4.181421e-01 s Time for refinement 4.297022e-01 s || A ||_1 
5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.936004e-07 max(|| b_i - A x_i ||_1) 8.486789e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.066442e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.936004e-07 max(|| b_i - A x_i ||_1) 8.486789e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.066442e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.936004e-07 max(|| b_i - A x_i ||_1) 8.486789e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.066442e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.936004e-07 max(|| b_i - A x_i ||_1) 8.486789e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.066442e+00 (SUCCESS) Start 2564: mpi_dst_example_simple_lap_s_facto1_sched1_kway_svdbegin 2187/3626 Test #2565: mpi_dst_example_simple_lap_s_facto1_sched1_kway_svdend ..................***Timeout 234.04 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 
1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.685119e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.324975e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.076479e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.401609e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.854376e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.451340e-01 s Time to initialize coeftab 1.495927e-01 s Time to factorize 1.172866e+00 s ( 4.46 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ 
Total 68.5 Ko / 68.5 Ko Time to solve 5.874163e-01 s Time for refinement 9.624379e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.733885e-07 max(|| b_i - A x_i ||_1) 7.635753e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.595013e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.733885e-07 max(|| b_i - A x_i ||_1) 7.635753e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.595013e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.733885e-07 max(|| b_i - A x_i ||_1) 7.635753e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.595013e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.733885e-07 max(|| b_i - A x_i ||_1) 7.635753e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.595013e-01 (SUCCESS) Start 2565: mpi_dst_example_simple_lap_s_facto1_sched1_kway_svdend 2187/3626 Test #2566: mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_svdbegin .....***Timeout 232.07 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size 
(min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.067253e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.474279e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.214401e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 9.255837e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.232668e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 7.653388e-01 s Time to initialize coeftab 8.384842e-01 s Time to factorize 4.324476e+00 s ( 1.21 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not 
selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 3.486328e-01 s Time for refinement 7.136193e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.840185e-07 max(|| b_i - A x_i ||_1) 8.325092e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.046123e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.840185e-07 max(|| b_i - A x_i ||_1) 8.325092e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.046123e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.840185e-07 max(|| b_i - A x_i ||_1) 8.325092e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.046123e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.840185e-07 max(|| b_i - A x_i ||_1) 8.325092e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.046123e+00 (SUCCESS) Start 2566: mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_svdbegin 2187/3626 Test #2567: mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_svdend .......***Timeout 228.26 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads 
per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.249943e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.390874e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.731774e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.504729e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.506801e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 6.226063e-01 s Time to initialize coeftab 1.049345e-01 s Time to factorize 2.896057e+00 s ( 1.81 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: 
------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 6.676135e-01 s Time for refinement 9.181446e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.752312e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.752312e-07 max(|| b_i - A x_i ||_1) 7.673982e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.643052e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.752312e-07 max(|| b_i - A x_i ||_1) 7.673982e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.643052e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.752312e-07 max(|| b_i - A x_i ||_1) 7.673982e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.643052e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 7.673982e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.643052e-01 (SUCCESS) Start 2567: mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_svdend 2187/3626 Test #2568: mpi_dst_example_simple_lap_s_facto1_sched1_not_pqrcpbegin ...............***Timeout 224.10 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 
Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.383336e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.789395e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.970420e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 9.140222e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.524287e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.341146e-01 s Time to initialize coeftab 3.034282e-01 s Time to 
factorize 3.021411e+00 s ( 1.73 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 6.092156e-01 s - iteration 1 : total iteration time 0.602 s error 5.4903e-11 Time for refinement 2.130510e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.651358e-08 max(|| b_i - A x_i ||_1) 2.777365e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.490010e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.651358e-08 max(|| b_i - A x_i ||_1) 2.777365e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.490010e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.651358e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.651358e-08 max(|| b_i - A x_i ||_1) 2.777365e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.490010e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.777365e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.490010e-01 (SUCCESS) Start 2568: mpi_dst_example_simple_lap_s_facto1_sched1_not_pqrcpbegin 2187/3626 Test #2569: mpi_dst_example_simple_lap_s_facto1_sched1_not_pqrcpend .................***Timeout 224.03 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.745927e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.673337e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.368439e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.483426e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.012655e+01 s 
+-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.899826e-01 s Time to initialize coeftab 5.670478e-01 s Time to factorize 7.261808e-01 s ( 7.21 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 9.245168e-01 s Time for refinement 8.399421e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.991361e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.991361e-07 max(|| b_i - A x_i ||_1) 1.163391e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.461905e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.163391e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.461905e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.991361e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.991361e-07 max(|| b_i - A x_i ||_1) 1.163391e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.461905e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.163391e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.461905e+00 (SUCCESS) Start 2569: mpi_dst_example_simple_lap_s_facto1_sched1_not_pqrcpend 2187/3626 Test #2570: mpi_dst_example_simple_lap_s_facto1_sched1_kway_pqrcpbegin ..............***Timeout 223.36 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.492218e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.549576e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.005045e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.689553e+00 
s +-------------------------------------------------+ Analyze task: Total time for analyze 7.787844e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.390626e-01 s Time to initialize coeftab 4.103221e-01 s Time to factorize 9.498092e-01 s ( 5.51 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 9.773981e-01 s - iteration 1 : total iteration time 1.67 s error 5.5017e-11 Time for refinement 3.270031e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.571809e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.571809e-08 max(|| b_i - A x_i ||_1) 2.788650e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.504191e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.788650e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.504191e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.571809e-08 max(|| b_i - A x_i ||_1) 2.788650e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.504191e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.571809e-08 max(|| b_i - A x_i ||_1) 2.788650e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.504191e-01 (SUCCESS) Start 2570: mpi_dst_example_simple_lap_s_facto1_sched1_kway_pqrcpbegin 2187/3626 Test #2571: mpi_dst_example_simple_lap_s_facto1_sched1_kway_pqrcpend 
................***Timeout 223.34 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.159261e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.790117e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.430880e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 
17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.424504e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.348614e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.695723e-01 s Time to initialize coeftab 1.477806e-01 s Time to factorize 1.766124e+00 s ( 2.96 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 8.528252e-01 s Time for refinement 6.303247e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.673030e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.673030e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.673030e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.673030e-07 max(|| b_i - A x_i ||_1) 1.089616e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.369200e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.089616e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.369200e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.089616e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.369200e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.089616e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.369200e+00 (SUCCESS) Start 2571: 
mpi_dst_example_simple_lap_s_facto1_sched1_kway_pqrcpend 2187/3626 Test #2572: mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_pqrcpbegin ...***Timeout 223.27 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 3: 200 660 2: 200 760 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.597154e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.248229e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for 
reordering 1.677018e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 8.857718e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.708453e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.592905e-01 s Time to initialize coeftab 1.923808e-01 s Time to factorize 1.109003e+00 s ( 4.72 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 8.290034e-01 s - iteration 1 : total iteration time 1.23 s error 4.976e-11 Time for refinement 2.524999e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.960700e-08 max(|| b_i - A x_i ||_1) 2.940117e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.694523e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.960700e-08 max(|| b_i - A x_i ||_1) 2.940117e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.694523e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.960700e-08 max(|| b_i - A x_i ||_1) 2.940117e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.694523e-01 (SUCCESS) 
max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.960700e-08 max(|| b_i - A x_i ||_1) 2.940117e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.694523e-01 (SUCCESS) Start 2572: mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_pqrcpbegin 2187/3626 Test #2573: mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_pqrcpend .....***Timeout 222.92 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.255873e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 
65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.535621e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.233650e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 8.784508e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.391393e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 5.877005e-01 s Time to initialize coeftab 1.109676e-01 s Time to factorize 7.698697e-01 s ( 6.80 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 7.357831e-01 s Time for refinement 7.803982e-01 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.585129e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.585129e-07 max(|| b_i - A x_i ||_1) 1.333883e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.676144e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.333883e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.676144e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 
9.585129e-07 max(|| b_i - A x_i ||_1) 1.333883e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.676144e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.585129e-07 max(|| b_i - A x_i ||_1) 1.333883e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.676144e+00 (SUCCESS) Start 2573: mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_pqrcpend 2187/3626 Test #2574: mpi_dst_example_simple_lap_s_facto1_sched1_not_rqrcpbegin ...............***Timeout 222.85 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.840146e+01 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.772845e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.350964e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.631676e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.077657e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 8.327191e-01 s Time to initialize coeftab 2.068028e-01 s Time to factorize 2.950549e+00 s ( 1.77 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44 Ko / 44.3 Ko ------------------------------------------------ Total 68.2 Ko / 68.5 Ko Time to solve 5.515195e-01 s - iteration 1 : total iteration time 0.506 s error 5.5047e-11 Time for refinement 1.470071e+00 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.751878e-08 max(|| b_i - A x_i ||_1) 2.863531e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.598285e-01 
(SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.751878e-08 max(|| b_i - A x_i ||_1) 2.863531e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.598285e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.751878e-08 max(|| b_i - A x_i ||_1) 2.863531e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.598285e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.751878e-08 max(|| b_i - A x_i ||_1) 2.863531e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.598285e-01 (SUCCESS) Start 2574: mpi_dst_example_simple_lap_s_facto1_sched1_not_rqrcpbegin 2187/3626 Test #2575: mpi_dst_example_simple_lap_s_facto1_sched1_not_rqrcpend .................***Timeout 222.48 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 
1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.113515e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.305347e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.394628e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 9.427979e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.256733e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.399548e-01 s Time to initialize coeftab 6.505386e-02 s Time to factorize 1.773420e+00 s ( 2.95 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 7.830117e-01 s Time for refinement 4.427973e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.702868e-07 
max(|| b_i - A x_i ||_1) 1.096015e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.377242e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.702868e-07 max(|| b_i - A x_i ||_1) 1.096015e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.377242e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.702868e-07 max(|| b_i - A x_i ||_1) 1.096015e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.377242e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.702868e-07 max(|| b_i - A x_i ||_1) 1.096015e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.377242e+00 (SUCCESS) Start 2575: mpi_dst_example_simple_lap_s_facto1_sched1_not_rqrcpend Test #2172: mpi_dst_example_simple_lap_d_facto1_sched0_kway_rqrrtbegin ..............***Timeout 221.07 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: 
Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.106320e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.346587e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.360614e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 4.614861e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.159632e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 6.794353e-01 s Time to initialize coeftab 5.978429e-01 s Time to factorize 9.938302e+00 s (539.23 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 5.844590e-02 s - iteration 1 : total iteration time 0.132 s error 6.4174e-13 Time for refinement 2.616385e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 
4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.417417e-13 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.417417e-13 max(|| b_i - A x_i ||_1) 1.272697e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.599254e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.272697e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.599254e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.417417e-13 max(|| b_i - A x_i ||_1) 1.272697e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.599254e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.417417e-13 max(|| b_i - A x_i ||_1) 1.272697e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.599254e+00 (SUCCESS) 2188/3626 Test #2576: mpi_dst_example_simple_lap_s_facto1_sched1_kway_rqrcpbegin ..............***Timeout 220.83 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of 
kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.375772e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.178745e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.991643e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 8.711387e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.518330e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.107968e-01 s Time to initialize coeftab 5.931175e-01 s Time to factorize 3.151678e+00 s ( 1.66 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44 Ko / 44.3 Ko ------------------------------------------------ Total 68.2 Ko / 68.5 Ko Time to solve 3.627746e-01 s - iteration 1 : total iteration time 1.01 s error 5.4724e-11 Time for refinement 1.897003e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i 
||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.705219e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.705219e-08 max(|| b_i - A x_i ||_1) 2.852655e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.584619e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.852655e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.584619e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.705219e-08 max(|| b_i - A x_i ||_1) 2.852655e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.584619e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.705219e-08 max(|| b_i - A x_i ||_1) 2.852655e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.584619e-01 (SUCCESS) Start 2576: mpi_dst_example_simple_lap_s_facto1_sched1_kway_rqrcpbegin 2188/3626 Test #2577: mpi_dst_example_simple_lap_s_facto1_sched1_kway_rqrcpend ................***Timeout 219.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal 
height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.065656e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.191214e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.661103e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.579073e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.275335e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.458404e-01 s Time to initialize coeftab 1.200118e-01 s Time to factorize 2.773214e+00 s ( 1.89 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 3.707754e-01 s Time for refinement 5.534957e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 
2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.545411e-07 max(|| b_i - A x_i ||_1) 1.323195e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.662714e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.545411e-07 max(|| b_i - A x_i ||_1) 1.323195e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.662714e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.545411e-07 max(|| b_i - A x_i ||_1) 1.323195e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.662714e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.545411e-07 max(|| b_i - A x_i ||_1) 1.323195e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.662714e+00 (SUCCESS) Start 2577: mpi_dst_example_simple_lap_s_facto1_sched1_kway_rqrcpend Start 2611: mpi_dst_example_simple_lap_s_facto2_sched1_not_tqrcpbegin Start 2612: mpi_dst_example_simple_lap_s_facto2_sched1_not_tqrcpend Start 2613: mpi_dst_example_simple_lap_s_facto2_sched1_kway_tqrcpbegin Start 2614: mpi_dst_example_simple_lap_s_facto2_sched1_kway_tqrcpend Start 2615: mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_tqrcpbegin Start 2616: mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_tqrcpend Start 2617: mpi_dst_example_simple_lap_s_facto2_sched1_not_rqrrtbegin Start 2618: mpi_dst_example_simple_lap_s_facto2_sched1_not_rqrrtend Start 2619: mpi_dst_example_simple_lap_s_facto2_sched1_kway_rqrrtbegin Start 2620: mpi_dst_example_simple_lap_s_facto2_sched1_kway_rqrrtend Start 2621: mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_rqrrtbegin Start 2622: mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_rqrrtend Start 2623: mpi_dst_example_simple_lap_s_facto2_sched1_kway_pqrcpilu0 Start 
2624: mpi_dst_example_simple_lap_s_facto2_sched1_kway_pqrcpilu1 Start 2625: mpi_dst_example_simple_lap_d_facto0_sched1_not_svdbegin Start 2626: mpi_dst_example_simple_lap_d_facto0_sched1_not_svdend Start 2627: mpi_dst_example_simple_lap_d_facto0_sched1_kway_svdbegin Start 2628: mpi_dst_example_simple_lap_d_facto0_sched1_kway_svdend Start 2629: mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_svdbegin Start 2630: mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_svdend Start 2631: mpi_dst_example_simple_lap_d_facto0_sched1_not_pqrcpbegin Start 2632: mpi_dst_example_simple_lap_d_facto0_sched1_not_pqrcpend Start 2633: mpi_dst_example_simple_lap_d_facto0_sched1_kway_pqrcpbegin Start 2634: mpi_dst_example_simple_lap_d_facto0_sched1_kway_pqrcpend Start 2635: mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_pqrcpbegin Start 2636: mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_pqrcpend Start 2637: mpi_dst_example_simple_lap_d_facto0_sched1_not_rqrcpbegin Start 2638: mpi_dst_example_simple_lap_d_facto0_sched1_not_rqrcpend Start 2639: mpi_dst_example_simple_lap_d_facto0_sched1_kway_rqrcpbegin Start 2640: mpi_dst_example_simple_lap_d_facto0_sched1_kway_rqrcpend Start 2641: mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_rqrcpbegin Start 2642: mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_rqrcpend Start 2643: mpi_dst_example_simple_lap_d_facto0_sched1_not_tqrcpbegin Start 2644: mpi_dst_example_simple_lap_d_facto0_sched1_not_tqrcpend Start 2645: mpi_dst_example_simple_lap_d_facto0_sched1_kway_tqrcpbegin Start 2646: mpi_dst_example_simple_lap_d_facto0_sched1_kway_tqrcpend Start 2647: mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_tqrcpbegin Start 2648: mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_tqrcpend Start 2649: mpi_dst_example_simple_lap_d_facto0_sched1_not_rqrrtbegin Start 2650: mpi_dst_example_simple_lap_d_facto0_sched1_not_rqrrtend Start 2651: 
mpi_dst_example_simple_lap_d_facto0_sched1_kway_rqrrtbegin Start 2652: mpi_dst_example_simple_lap_d_facto0_sched1_kway_rqrrtend Start 2653: mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_rqrrtbegin Start 2654: mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_rqrrtend Start 2655: mpi_dst_example_simple_lap_d_facto0_sched1_kway_pqrcpilu0 Start 2656: mpi_dst_example_simple_lap_d_facto0_sched1_kway_pqrcpilu1 Start 2657: mpi_dst_example_simple_lap_d_facto1_sched1_not_svdbegin Start 2658: mpi_dst_example_simple_lap_d_facto1_sched1_not_svdend Start 2659: mpi_dst_example_simple_lap_d_facto1_sched1_kway_svdbegin Start 2660: mpi_dst_example_simple_lap_d_facto1_sched1_kway_svdend Start 2661: mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_svdbegin Start 2662: mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_svdend Start 2663: mpi_dst_example_simple_lap_d_facto1_sched1_not_pqrcpbegin Start 2664: mpi_dst_example_simple_lap_d_facto1_sched1_not_pqrcpend Start 2665: mpi_dst_example_simple_lap_d_facto1_sched1_kway_pqrcpbegin Start 2666: mpi_dst_example_simple_lap_d_facto1_sched1_kway_pqrcpend Start 2667: mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_pqrcpbegin Start 2668: mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_pqrcpend Start 2669: mpi_dst_example_simple_lap_d_facto1_sched1_not_rqrcpbegin Start 2670: mpi_dst_example_simple_lap_d_facto1_sched1_not_rqrcpend Start 2671: mpi_dst_example_simple_lap_d_facto1_sched1_kway_rqrcpbegin Start 2672: mpi_dst_example_simple_lap_d_facto1_sched1_kway_rqrcpend Start 2673: mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_rqrcpbegin Start 2674: mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_rqrcpend Start 2675: mpi_dst_example_simple_lap_d_facto1_sched1_not_tqrcpbegin Start 2676: mpi_dst_example_simple_lap_d_facto1_sched1_not_tqrcpend Start 2677: mpi_dst_example_simple_lap_d_facto1_sched1_kway_tqrcpbegin Start 2678: 
mpi_dst_example_simple_lap_d_facto1_sched1_kway_tqrcpend Start 2679: mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_tqrcpbegin Start 2680: mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_tqrcpend Start 2681: mpi_dst_example_simple_lap_d_facto1_sched1_not_rqrrtbegin Start 2682: mpi_dst_example_simple_lap_d_facto1_sched1_not_rqrrtend Start 2683: mpi_dst_example_simple_lap_d_facto1_sched1_kway_rqrrtbegin Start 2684: mpi_dst_example_simple_lap_d_facto1_sched1_kway_rqrrtend Start 2685: mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_rqrrtbegin Start 2686: mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_rqrrtend Start 2687: mpi_dst_example_simple_lap_d_facto1_sched1_kway_pqrcpilu0 Start 2688: mpi_dst_example_simple_lap_d_facto1_sched1_kway_pqrcpilu1 Start 2689: mpi_dst_example_simple_lap_d_facto2_sched1_not_svdbegin Start 2690: mpi_dst_example_simple_lap_d_facto2_sched1_not_svdend Start 2691: mpi_dst_example_simple_lap_d_facto2_sched1_kway_svdbegin Start 2692: mpi_dst_example_simple_lap_d_facto2_sched1_kway_svdend Start 2693: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_svdbegin Start 2694: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_svdend Start 2695: mpi_dst_example_simple_lap_d_facto2_sched1_not_pqrcpbegin Start 2696: mpi_dst_example_simple_lap_d_facto2_sched1_not_pqrcpend Start 2697: mpi_dst_example_simple_lap_d_facto2_sched1_kway_pqrcpbegin Start 2698: mpi_dst_example_simple_lap_d_facto2_sched1_kway_pqrcpend Start 2699: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_pqrcpbegin Test #2173: mpi_dst_example_simple_lap_d_facto1_sched0_kway_rqrrtend ................ Passed 90.04 sec Test #2193: mpi_dst_example_simple_lap_d_facto2_sched0_kway_rqrcpend ................ Passed 60.46 sec Test #2219: mpi_dst_example_simple_lap_c_facto0_sched0_kway_pqrcpend ................ Passed 36.91 sec Test #2223: mpi_dst_example_simple_lap_c_facto0_sched0_not_rqrcpend ................. 
Passed 33.86 sec Start 2700: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_pqrcpend Start 2701: mpi_dst_example_simple_lap_d_facto2_sched1_not_rqrcpbegin Start 2702: mpi_dst_example_simple_lap_d_facto2_sched1_not_rqrcpend Start 2703: mpi_dst_example_simple_lap_d_facto2_sched1_kway_rqrcpbegin Test #2296: mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_tqrcpbegin ...***Timeout 23.47 sec Test #2499: mpi_dst_example_simple_lap_z_facto4_sched0_not_svdend ...................***Timeout 20.61 sec Start 2499: mpi_dst_example_simple_lap_z_facto4_sched0_not_svdend 2193/3626 Test #2578: mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_rqrcpbegin ...***Timeout 20.40 sec Start 2578: mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_rqrcpbegin 2193/3626 Test #2579: mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_rqrcpend .....***Timeout 20.41 sec Start 2579: mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_rqrcpend 2193/3626 Test #2580: mpi_dst_example_simple_lap_s_facto1_sched1_not_tqrcpbegin ...............***Timeout 20.41 sec Start 2580: mpi_dst_example_simple_lap_s_facto1_sched1_not_tqrcpbegin 2193/3626 Test #2581: mpi_dst_example_simple_lap_s_facto1_sched1_not_tqrcpend .................***Timeout 20.41 sec Start 2581: mpi_dst_example_simple_lap_s_facto1_sched1_not_tqrcpend 2193/3626 Test #2582: mpi_dst_example_simple_lap_s_facto1_sched1_kway_tqrcpbegin ..............***Timeout 20.41 sec Start 2582: mpi_dst_example_simple_lap_s_facto1_sched1_kway_tqrcpbegin 2193/3626 Test #2583: mpi_dst_example_simple_lap_s_facto1_sched1_kway_tqrcpend ................***Timeout 20.41 sec Start 2583: mpi_dst_example_simple_lap_s_facto1_sched1_kway_tqrcpend 2193/3626 Test #2584: mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_tqrcpbegin ...***Timeout 20.41 sec Start 2584: mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_tqrcpbegin 2193/3626 Test #2585: 
mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_tqrcpend .....***Timeout 20.41 sec Start 2585: mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_tqrcpend 2193/3626 Test #2586: mpi_dst_example_simple_lap_s_facto1_sched1_not_rqrrtbegin ...............***Timeout 20.42 sec Start 2586: mpi_dst_example_simple_lap_s_facto1_sched1_not_rqrrtbegin 2193/3626 Test #2587: mpi_dst_example_simple_lap_s_facto1_sched1_not_rqrrtend .................***Timeout 20.42 sec Start 2587: mpi_dst_example_simple_lap_s_facto1_sched1_not_rqrrtend 2193/3626 Test #2588: mpi_dst_example_simple_lap_s_facto1_sched1_kway_rqrrtbegin ..............***Timeout 20.43 sec Start 2588: mpi_dst_example_simple_lap_s_facto1_sched1_kway_rqrrtbegin 2193/3626 Test #2589: mpi_dst_example_simple_lap_s_facto1_sched1_kway_rqrrtend ................***Timeout 20.43 sec Start 2589: mpi_dst_example_simple_lap_s_facto1_sched1_kway_rqrrtend 2193/3626 Test #2590: mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_rqrrtbegin ...***Timeout 20.43 sec Start 2590: mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_rqrrtbegin 2193/3626 Test #2591: mpi_dst_example_simple_lap_s_facto1_sched1_kway_pqrcpilu0 ...............***Timeout 20.44 sec Start 2591: mpi_dst_example_simple_lap_s_facto1_sched1_kway_pqrcpilu0 2193/3626 Test #2592: mpi_dst_example_simple_lap_s_facto1_sched1_kway_pqrcpilu1 ...............***Timeout 20.44 sec Start 2592: mpi_dst_example_simple_lap_s_facto1_sched1_kway_pqrcpilu1 2193/3626 Test #2593: mpi_dst_example_simple_lap_s_facto2_sched1_not_svdbegin .................***Timeout 20.44 sec Start 2593: mpi_dst_example_simple_lap_s_facto2_sched1_not_svdbegin 2193/3626 Test #2594: mpi_dst_example_simple_lap_s_facto2_sched1_not_svdend ...................***Timeout 20.44 sec Start 2594: mpi_dst_example_simple_lap_s_facto2_sched1_not_svdend 2193/3626 Test #2595: mpi_dst_example_simple_lap_s_facto2_sched1_kway_svdbegin ................***Timeout 20.45 sec Start 2595: 
mpi_dst_example_simple_lap_s_facto2_sched1_kway_svdbegin 2193/3626 Test #2596: mpi_dst_example_simple_lap_s_facto2_sched1_kway_svdend ..................***Timeout 20.45 sec Start 2596: mpi_dst_example_simple_lap_s_facto2_sched1_kway_svdend 2193/3626 Test #2597: mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_svdbegin .....***Timeout 20.45 sec Start 2597: mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_svdbegin 2193/3626 Test #2598: mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_svdend .......***Timeout 20.45 sec Start 2598: mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_svdend 2193/3626 Test #2599: mpi_dst_example_simple_lap_s_facto2_sched1_not_pqrcpbegin ...............***Timeout 20.46 sec Start 2599: mpi_dst_example_simple_lap_s_facto2_sched1_not_pqrcpbegin 2193/3626 Test #2600: mpi_dst_example_simple_lap_s_facto2_sched1_not_pqrcpend .................***Timeout 20.47 sec Start 2600: mpi_dst_example_simple_lap_s_facto2_sched1_not_pqrcpend 2193/3626 Test #2601: mpi_dst_example_simple_lap_s_facto2_sched1_kway_pqrcpbegin ..............***Timeout 20.48 sec Start 2601: mpi_dst_example_simple_lap_s_facto2_sched1_kway_pqrcpbegin 2193/3626 Test #2602: mpi_dst_example_simple_lap_s_facto2_sched1_kway_pqrcpend ................***Timeout 20.48 sec Start 2602: mpi_dst_example_simple_lap_s_facto2_sched1_kway_pqrcpend 2193/3626 Test #2603: mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_pqrcpbegin ...***Timeout 20.49 sec Start 2603: mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_pqrcpbegin 2193/3626 Test #2604: mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_pqrcpend .....***Timeout 20.50 sec Start 2604: mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_pqrcpend 2193/3626 Test #2605: mpi_dst_example_simple_lap_s_facto2_sched1_not_rqrcpbegin ...............***Timeout 20.51 sec Start 2605: mpi_dst_example_simple_lap_s_facto2_sched1_not_rqrcpbegin 2193/3626 Test #2606: 
mpi_dst_example_simple_lap_s_facto2_sched1_not_rqrcpend .................***Timeout 20.52 sec Start 2606: mpi_dst_example_simple_lap_s_facto2_sched1_not_rqrcpend 2193/3626 Test #2607: mpi_dst_example_simple_lap_s_facto2_sched1_kway_rqrcpbegin ..............***Timeout 20.53 sec Start 2607: mpi_dst_example_simple_lap_s_facto2_sched1_kway_rqrcpbegin 2193/3626 Test #2608: mpi_dst_example_simple_lap_s_facto2_sched1_kway_rqrcpend ................***Timeout 20.54 sec Start 2608: mpi_dst_example_simple_lap_s_facto2_sched1_kway_rqrcpend 2193/3626 Test #2609: mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_rqrcpbegin ...***Timeout 20.55 sec Start 2609: mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_rqrcpbegin 2193/3626 Test #2610: mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_rqrcpend .....***Timeout 20.56 sec Start 2610: mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_rqrcpend Start 2704: mpi_dst_example_simple_lap_d_facto2_sched1_kway_rqrcpend Test #2143: mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_rqrrtend ..... Passed 65.44 sec Start 2705: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_rqrcpbegin Test #2141: mpi_dst_example_simple_lap_d_facto0_sched0_kway_rqrrtend ................ Passed 65.52 sec Start 2706: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_rqrcpend Test #2144: mpi_dst_example_simple_lap_d_facto0_sched0_kway_pqrcpilu0 ............... Passed 65.60 sec Start 2707: mpi_dst_example_simple_lap_d_facto2_sched1_not_tqrcpbegin Test #2158: mpi_dst_example_simple_lap_d_facto1_sched0_not_rqrcpbegin ............... Passed 67.81 sec Start 2708: mpi_dst_example_simple_lap_d_facto2_sched1_not_tqrcpend Test #2169: mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_tqrcpend ..... Passed 74.14 sec Start 2709: mpi_dst_example_simple_lap_d_facto2_sched1_kway_tqrcpbegin Test #2191: mpi_dst_example_simple_lap_d_facto2_sched0_not_rqrcpend ................. 
Passed 77.20 sec Start 2710: mpi_dst_example_simple_lap_d_facto2_sched1_kway_tqrcpend 2199/3626 Test #2635: mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_pqrcpbegin ... Passed 75.80 sec Start 2711: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_tqrcpbegin Test #2182: mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_svdbegin ..... Passed 140.31 sec Start 2712: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_tqrcpend Test #2224: mpi_dst_example_simple_lap_c_facto0_sched0_kway_rqrcpbegin .............. Passed 198.50 sec Start 2713: mpi_dst_example_simple_lap_d_facto2_sched1_not_rqrrtbegin Test #2135: mpi_dst_example_simple_lap_d_facto0_sched0_kway_tqrcpend ................***Timeout 496.50 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2136: mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_tqrcpbegin ...***Timeout 496.50 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 
Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Test #2137: mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_tqrcpend .....***Timeout 496.49 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Test #2138: mpi_dst_example_simple_lap_d_facto0_sched0_not_rqrrtbegin ...............***Timeout 496.48 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size 
(min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2139: mpi_dst_example_simple_lap_d_facto0_sched0_not_rqrrtend .................***Timeout 496.48 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2140: mpi_dst_example_simple_lap_d_facto0_sched0_kway_rqrrtbegin ..............***Timeout 496.47 sec ischedInit: The thread number has been automatically set to 256 Test #2142: mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_rqrrtbegin ...***Timeout 496.47 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2145: mpi_dst_example_simple_lap_d_facto0_sched0_kway_pqrcpilu1 ...............***Timeout 496.45 sec Test #2146: mpi_dst_example_simple_lap_d_facto1_sched0_not_svdbegin .................***Timeout 496.45 sec Test #2147: mpi_dst_example_simple_lap_d_facto1_sched0_not_svdend ...................***Timeout 496.44 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI 
communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2148: mpi_dst_example_simple_lap_d_facto1_sched0_kway_svdbegin ................***Timeout 496.44 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2150: mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_svdbegin .....***Timeout 496.43 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections 
distance 3 Projections depth 3 Projections width 1 Test #2151: mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_svdend .......***Timeout 496.43 sec Test #2152: mpi_dst_example_simple_lap_d_facto1_sched0_not_pqrcpbegin ...............***Timeout 496.42 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Test #2153: mpi_dst_example_simple_lap_d_facto1_sched0_not_pqrcpend .................***Timeout 496.42 sec Test #2155: mpi_dst_example_simple_lap_d_facto1_sched0_kway_pqrcpend ................***Timeout 496.42 sec ischedInit: The thread number has been automatically set to 256 Test #2157: mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_pqrcpend .....***Timeout 496.41 sec Test #2159: mpi_dst_example_simple_lap_d_facto1_sched0_not_rqrcpend .................***Timeout 496.40 sec Test #2160: mpi_dst_example_simple_lap_d_facto1_sched0_kway_rqrcpbegin ..............***Timeout 496.40 sec ischedInit: The thread number has been automatically set to 256 
ischedInit: The thread number has been automatically set to 256 Test #2161: mpi_dst_example_simple_lap_d_facto1_sched0_kway_rqrcpend ................***Timeout 496.40 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2163: mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_rqrcpend .....***Timeout 496.39 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 
Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2164: mpi_dst_example_simple_lap_d_facto1_sched0_not_tqrcpbegin ...............***Timeout 496.39 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2165: mpi_dst_example_simple_lap_d_facto1_sched0_not_tqrcpend .................***Timeout 496.38 sec ischedInit: The thread number has been automatically set to 256 Test #2166: mpi_dst_example_simple_lap_d_facto1_sched0_kway_tqrcpbegin ..............***Timeout 496.38 sec Test #2167: mpi_dst_example_simple_lap_d_facto1_sched0_kway_tqrcpend ................***Timeout 496.37 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 
Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2168: mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_tqrcpbegin ...***Timeout 496.37 sec Test #2171: mpi_dst_example_simple_lap_d_facto1_sched0_not_rqrrtend .................***Timeout 496.36 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2174: mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_rqrrtbegin ...***Timeout 496.36 sec ischedInit: The thread number has been automatically set to 256 Test #2176: mpi_dst_example_simple_lap_d_facto1_sched0_kway_pqrcpilu0 ...............***Timeout 496.35 sec Test #2177: mpi_dst_example_simple_lap_d_facto1_sched0_kway_pqrcpilu1 ...............***Timeout 496.34 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections 
width 1 Test #2178: mpi_dst_example_simple_lap_d_facto2_sched0_not_svdbegin .................***Timeout 496.34 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2179: mpi_dst_example_simple_lap_d_facto2_sched0_not_svdend ...................***Timeout 496.33 sec ischedInit: The thread number has been automatically set to 256 Test #2183: mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_svdend .......***Timeout 496.32 sec Test #2184: mpi_dst_example_simple_lap_d_facto2_sched0_not_pqrcpbegin ...............***Timeout 496.32 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication 
support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2185: mpi_dst_example_simple_lap_d_facto2_sched0_not_pqrcpend .................***Timeout 496.31 sec Test #2186: mpi_dst_example_simple_lap_d_facto2_sched0_kway_pqrcpbegin ..............***Timeout 496.31 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2187: mpi_dst_example_simple_lap_d_facto2_sched0_kway_pqrcpend ................***Timeout 496.30 
sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2188: mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_pqrcpbegin ...***Timeout 496.30 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute 
Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2189: mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_pqrcpend .....***Timeout 496.29 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2190: mpi_dst_example_simple_lap_d_facto2_sched0_not_rqrcpbegin ...............***Timeout 496.29 sec Test #2192: mpi_dst_example_simple_lap_d_facto2_sched0_kway_rqrcpbegin ..............***Timeout 496.28 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Test #2194: mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_rqrcpbegin ...***Timeout 496.28 sec 
ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2195: mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_rqrcpend .....***Timeout 496.27 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 
Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Test #2196: mpi_dst_example_simple_lap_d_facto2_sched0_not_tqrcpbegin ...............***Timeout 496.27 sec ischedInit: The thread number has been automatically set to 256 Test #2197: mpi_dst_example_simple_lap_d_facto2_sched0_not_tqrcpend .................***Timeout 496.26 sec Test #2198: mpi_dst_example_simple_lap_d_facto2_sched0_kway_tqrcpbegin ..............***Timeout 496.26 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2199: mpi_dst_example_simple_lap_d_facto2_sched0_kway_tqrcpend ................***Timeout 496.25 sec Test #2200: 
mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_tqrcpbegin ...***Timeout 496.25 sec ischedInit: The thread number has been automatically set to 256 Test #2201: mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_tqrcpend .....***Timeout 496.24 sec Test #2202: mpi_dst_example_simple_lap_d_facto2_sched0_not_rqrrtbegin ...............***Timeout 496.24 sec ischedInit: The thread number has been automatically set to 256 Test #2203: mpi_dst_example_simple_lap_d_facto2_sched0_not_rqrrtend .................***Timeout 496.23 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Test #2204: mpi_dst_example_simple_lap_d_facto2_sched0_kway_rqrrtbegin ..............***Timeout 496.23 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started 
thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2205: mpi_dst_example_simple_lap_d_facto2_sched0_kway_rqrrtend ................***Timeout 496.22 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set 
to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2206: mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_rqrrtbegin ...***Timeout 496.22 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2207: mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_rqrrtend .....***Timeout 496.21 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2208: mpi_dst_example_simple_lap_d_facto2_sched0_kway_pqrcpilu0 ...............***Timeout 496.21 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2209: mpi_dst_example_simple_lap_d_facto2_sched0_kway_pqrcpilu1 ...............***Timeout 496.20 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2210: mpi_dst_example_simple_lap_c_facto0_sched0_not_svdbegin .................***Timeout 496.20 sec ischedInit: The thread number has been automatically set to 256 Test #2211: mpi_dst_example_simple_lap_c_facto0_sched0_not_svdend ...................***Timeout 496.19 sec Test #2212: mpi_dst_example_simple_lap_c_facto0_sched0_kway_svdbegin ................***Timeout 496.19 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX 
package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Test #2213: mpi_dst_example_simple_lap_c_facto0_sched0_kway_svdend ..................***Timeout 496.18 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2214: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_svdbegin .....***Timeout 496.18 sec ischedInit: The thread number has been automatically set to 256 Test #2215: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_svdend .......***Timeout 496.18 sec Test #2216: mpi_dst_example_simple_lap_c_facto0_sched0_not_pqrcpbegin ...............***Timeout 496.17 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple 
Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2217: mpi_dst_example_simple_lap_c_facto0_sched0_not_pqrcpend .................***Timeout 496.17 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2218: mpi_dst_example_simple_lap_c_facto0_sched0_kway_pqrcpbegin ..............***Timeout 496.16 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled 
thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2220: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_pqrcpbegin ...***Timeout 496.16 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2221: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_pqrcpend .....***Timeout 496.15 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI 
communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Test #2222: mpi_dst_example_simple_lap_c_facto0_sched0_not_rqrcpbegin ...............***Timeout 496.15 sec Test #2225: mpi_dst_example_simple_lap_c_facto0_sched0_kway_rqrcpend ................***Timeout 496.14 sec Test #2226: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_rqrcpbegin ...***Timeout 496.13 sec Test #2227: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_rqrcpend .....***Timeout 496.13 sec Test #2228: mpi_dst_example_simple_lap_c_facto0_sched0_not_tqrcpbegin ...............***Timeout 496.12 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2229: mpi_dst_example_simple_lap_c_facto0_sched0_not_tqrcpend .................***Timeout 496.12 sec Test #2230: mpi_dst_example_simple_lap_c_facto0_sched0_kway_tqrcpbegin ..............***Timeout 496.12 sec ischedInit: The thread number has been automatically set to 256 Test #2231: mpi_dst_example_simple_lap_c_facto0_sched0_kway_tqrcpend ................***Timeout 496.11 sec Test #2233: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_tqrcpend .....***Timeout 496.11 sec Test #2234: mpi_dst_example_simple_lap_c_facto0_sched0_not_rqrrtbegin ...............***Timeout 496.10 sec Test #2235: mpi_dst_example_simple_lap_c_facto0_sched0_not_rqrrtend 
.................***Timeout 496.10 sec Test #2236: mpi_dst_example_simple_lap_c_facto0_sched0_kway_rqrrtbegin ..............***Timeout 496.09 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Test #2238: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_rqrrtbegin ...***Timeout 496.09 sec ischedInit: The thread number has been automatically set to 256 Test #2239: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_rqrrtend .....***Timeout 496.08 sec ischedInit: The thread number has been automatically set to 256 Test #2240: mpi_dst_example_simple_lap_c_facto0_sched0_kway_pqrcpilu0 ...............***Timeout 496.08 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2241: mpi_dst_example_simple_lap_c_facto0_sched0_kway_pqrcpilu1 ...............***Timeout 496.07 sec ischedInit: The thread number has been automatically set to 256 Test #2243: 
mpi_dst_example_simple_lap_c_facto1_sched0_not_svdend ...................***Timeout 496.07 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Test #2245: mpi_dst_example_simple_lap_c_facto1_sched0_kway_svdend ..................***Timeout 496.06 sec ischedInit: The thread number has been automatically set to 256 Test #2246: mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_svdbegin .....***Timeout 496.06 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication 
support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2247: mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_svdend .......***Timeout 496.06 sec ischedInit: The thread number has been automatically set to 256 Test #2248: mpi_dst_example_simple_lap_c_facto1_sched0_not_pqrcpbegin ...............***Timeout 496.05 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2249: mpi_dst_example_simple_lap_c_facto1_sched0_not_pqrcpend .................***Timeout 496.05 sec Test #2250: mpi_dst_example_simple_lap_c_facto1_sched0_kway_pqrcpbegin ..............***Timeout 496.04 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion 
per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2251: mpi_dst_example_simple_lap_c_facto1_sched0_kway_pqrcpend ................***Timeout 496.04 sec ischedInit: The thread number has been automatically set to 256 Test #2252: mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_pqrcpbegin ...***Timeout 496.03 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2253: mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_pqrcpend .....***Timeout 496.03 sec ischedInit: The thread number has been automatically set to 256 Test #2254: mpi_dst_example_simple_lap_c_facto1_sched0_not_rqrcpbegin ...............***Timeout 496.02 sec ischedInit: The thread number has been automatically set to 256 Test #2255: mpi_dst_example_simple_lap_c_facto1_sched0_not_rqrcpend .................***Timeout 496.02 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 
Projections depth 3 Projections width 1 Test #2256: mpi_dst_example_simple_lap_c_facto1_sched0_kway_rqrcpbegin ..............***Timeout 496.01 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2257: mpi_dst_example_simple_lap_c_facto1_sched0_kway_rqrcpend ................***Timeout 496.01 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models 
CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2258: mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_rqrcpbegin ...***Timeout 496.01 sec ischedInit: The thread number has been automatically set to 256 Test #2259: mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_rqrcpend .....***Timeout 496.00 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2260: mpi_dst_example_simple_lap_c_facto1_sched0_not_tqrcpbegin ...............***Timeout 496.00 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2261: mpi_dst_example_simple_lap_c_facto1_sched0_not_tqrcpend .................***Timeout 495.99 sec ischedInit: The thread number has been automatically set to 256 Test #2263: mpi_dst_example_simple_lap_c_facto1_sched0_kway_tqrcpend ................***Timeout 495.99 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled 
Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2264: mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_tqrcpbegin ...***Timeout 495.98 sec Test #2265: mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_tqrcpend .....***Timeout 495.98 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2266: mpi_dst_example_simple_lap_c_facto1_sched0_not_rqrrtbegin ...............***Timeout 495.97 sec Test #2267: mpi_dst_example_simple_lap_c_facto1_sched0_not_rqrrtend .................***Timeout 495.97 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 
Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2269: mpi_dst_example_simple_lap_c_facto1_sched0_kway_rqrrtend ................***Timeout 495.96 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2270: mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_rqrrtbegin ...***Timeout 495.96 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2271: mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_rqrrtend .....***Timeout 495.95 sec ischedInit: The thread number has been automatically set to 256 Test #2272: mpi_dst_example_simple_lap_c_facto1_sched0_kway_pqrcpilu0 ...............***Timeout 495.95 sec Test #2273: mpi_dst_example_simple_lap_c_facto1_sched0_kway_pqrcpilu1 ...............***Timeout 495.95 sec ischedInit: The thread number has been automatically set to 256 Test #2274: mpi_dst_example_simple_lap_c_facto2_sched0_not_svdbegin .................***Timeout 495.94 sec ischedInit: The thread number has been automatically set to 256 Test #2276: mpi_dst_example_simple_lap_c_facto2_sched0_kway_svdbegin ................***Timeout 495.93 sec ischedInit: The thread number has been automatically set to 256 Test #2277: mpi_dst_example_simple_lap_c_facto2_sched0_kway_svdend ..................***Timeout 495.93 sec ischedInit: The thread number has been automatically set to 256 Test #2278: mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_svdbegin .....***Timeout 495.92 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 
256
+-------------------------------------------------+
+ PaStiX : Parallel Sparse matriX package +
+-------------------------------------------------+
  Version: 6.4.0
  Schedulers:
    sequential: Enabled
    thread static: Started
    thread dynamic: Disabled
    PaRSEC: Disabled
    StarPU: Disabled
  Number of MPI processes: 4
  Number of threads per process: 256
  Number of GPUs: 0
  MPI communication support: PastixMpiThreadMultiple
  Distribution level: 2D( 160)
  Blocking size (min/max): 160 / 320
  Computational models
    CPU: AMD Opteron 6180 - Intel MKL
    GPU: Nvidia K40 GK1108L - CUDA 8.0
  Low rank parameters:
    Strategy Memory Optimal
    Tolerance 1e-08
    Compress method SVD
    Compress minimal width 16
    Compress minimal height 16
    Compress min ratio 1.000000
    Tolerance criterion per block Absolute
    Orthogonalization method CGS
    Splitting Strategy KWAY and projections
    Levels of projections 0
    Levels of kway 2147483647
    Projections distance 3
    Projections depth 3
    Projections width 1
  Matrix type: Hermitian
  Arithmetic: Complex32
  Format: CSC
  N: 1000
  nnz: 3700
  Details:
        N     nnz
    0:  300   1140
    1:  300   1140
    2:  200   760
    3:  200   660
+-------------------------------------------------+
  Ordering subtask :
    Ordering method is: Scotch
    Time to compute ordering 2.671460e+01 s
+-------------------------------------------------+
  Symbolic factorization subtask:
    Symbol factorization using: Fax Direct
    Number of nonzeroes in L structure 65092
    Fill-in of L 17.592432
    Time to compute symbol matrix 9.006285e-02 s
+-------------------------------------------------+
  Reordering subtask:
    Split level 0
    Stoping criterion -1
    Time for reordering 7.399643e-02 s
+-------------------------------------------------+
  Mapping/Scheduling subtask:
    Number of non-zeroes in blocked L 130184
    Fill-in 35.184865
    Number of operations in full-rank: LU 39.97 MFlops
    Prediction:
      Model AMD 6180 MKL
      Time to factorize 1.548061e-03 s
    Time for mapping/scheduling 1.320329e+00 s
+-------------------------------------------------+
  Analyze task:
    Total time for analyze 2.935142e+01 s
+-------------------------------------------------+
  Factorization task:
    Factorization used: LU
    Time to initialize internal csc 5.384159e-01 s
    Time to initialize coeftab 7.799762e-01 s
    Time to factorize 4.317168e+01 s (948.06 KFlop/s)
    Number of operations 32.61 MFlops
    Number of static pivots 0
    Compression:
      ------------------------------------------------
      Full-rank supernodes
        Inside 0 o
        Outside 0 o
      Low-rank supernodes
        Diag in diag 48.4 Ko
        Inside not selected 0 o / 0 o
        Inside selected 0 o / 0 o
        Outside 177 Ko / 177 Ko
      ------------------------------------------------
      Total 226 Ko / 226 Ko
Test #2279: mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_svdend .......***Timeout 495.92 sec
Test #2280: mpi_dst_example_simple_lap_c_facto2_sched0_not_pqrcpbegin ...............***Timeout 495.92 sec
ischedInit: The thread number has been automatically set to 256
ischedInit: The thread number has been automatically set to 256
Test #2281: mpi_dst_example_simple_lap_c_facto2_sched0_not_pqrcpend .................***Timeout 495.91 sec
Test #2282: mpi_dst_example_simple_lap_c_facto2_sched0_kway_pqrcpbegin ..............***Timeout 495.91 sec
ischedInit: The thread number has been automatically set to 256
ischedInit: The thread number has been automatically set to 256
Test #2283: mpi_dst_example_simple_lap_c_facto2_sched0_kway_pqrcpend ................***Timeout 495.90 sec
ischedInit: The thread number has been automatically set to 256
ischedInit: The thread number has been automatically set to 256
ischedInit: The thread number has been automatically set to 256
+-------------------------------------------------+
+ PaStiX : Parallel Sparse matriX package +
+-------------------------------------------------+
  Version: 6.4.0
  Schedulers:
    sequential: Enabled
    thread static: Started
    thread dynamic: Disabled
    PaRSEC: Disabled
    StarPU: Disabled
  Number of MPI processes: 4
  Number of threads per process: 256
  Number of GPUs: 0
  MPI communication support: PastixMpiThreadMultiple
  Distribution
level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2284: mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_pqrcpbegin ...***Timeout 495.90 sec ischedInit: The thread number has been automatically set to 256 Test #2285: mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_pqrcpend .....***Timeout 495.89 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Test #2286: mpi_dst_example_simple_lap_c_facto2_sched0_not_rqrcpbegin 
...............***Timeout 495.89 sec Test #2287: mpi_dst_example_simple_lap_c_facto2_sched0_not_rqrcpend .................***Timeout 495.88 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2288: mpi_dst_example_simple_lap_c_facto2_sched0_kway_rqrcpbegin ..............***Timeout 495.88 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2289: mpi_dst_example_simple_lap_c_facto2_sched0_kway_rqrcpend ................***Timeout 495.87 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2290: mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_rqrcpbegin ...***Timeout 495.87 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number 
has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2291: mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_rqrcpend .....***Timeout 495.86 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational 
models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2292: mpi_dst_example_simple_lap_c_facto2_sched0_not_tqrcpbegin ...............***Timeout 495.86 sec Test #2293: mpi_dst_example_simple_lap_c_facto2_sched0_not_tqrcpend .................***Timeout 495.85 sec ischedInit: The thread number has been automatically set to 256 Test #2294: mpi_dst_example_simple_lap_c_facto2_sched0_kway_tqrcpbegin ..............***Timeout 495.85 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2295: mpi_dst_example_simple_lap_c_facto2_sched0_kway_tqrcpend ................***Timeout 
495.84 sec Test #2297: mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_tqrcpend .....***Timeout 495.84 sec Test #2298: mpi_dst_example_simple_lap_c_facto2_sched0_not_rqrrtbegin ...............***Timeout 495.83 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2299: mpi_dst_example_simple_lap_c_facto2_sched0_not_rqrrtend .................***Timeout 495.83 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia 
K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2300: mpi_dst_example_simple_lap_c_facto2_sched0_kway_rqrrtbegin ..............***Timeout 495.82 sec Test #2301: mpi_dst_example_simple_lap_c_facto2_sched0_kway_rqrrtend ................***Timeout 495.82 sec Test #2302: mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_rqrrtbegin ...***Timeout 495.82 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Test #2303: mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_rqrrtend .....***Timeout 495.81 sec Test #2304: mpi_dst_example_simple_lap_c_facto2_sched0_kway_pqrcpilu0 ...............***Timeout 495.81 
sec Test #2305: mpi_dst_example_simple_lap_c_facto2_sched0_kway_pqrcpilu1 ...............***Timeout 495.80 sec Test #2307: mpi_dst_example_simple_lap_c_facto3_sched0_not_svdend ...................***Timeout 495.80 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2309: mpi_dst_example_simple_lap_c_facto3_sched0_kway_svdend ..................***Timeout 495.79 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2311: mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_svdend .......***Timeout 495.79 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: 
PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Test #2312: mpi_dst_example_simple_lap_c_facto3_sched0_not_pqrcpbegin ...............***Timeout 495.78 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2313: mpi_dst_example_simple_lap_c_facto3_sched0_not_pqrcpend .................***Timeout 495.78 sec Test #2315: mpi_dst_example_simple_lap_c_facto3_sched0_kway_pqrcpend ................***Timeout 495.77 sec ischedInit: The thread number has been automatically set to 256 Test #2316: mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_pqrcpbegin ...***Timeout 495.77 sec Test #2317: mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_pqrcpend .....***Timeout 495.77 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: 
Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2318: mpi_dst_example_simple_lap_c_facto3_sched0_not_rqrcpbegin ...............***Timeout 495.76 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2319: mpi_dst_example_simple_lap_c_facto3_sched0_not_rqrcpend .................***Timeout 495.76 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI 
processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2320: mpi_dst_example_simple_lap_c_facto3_sched0_kway_rqrcpbegin ..............***Timeout 495.75 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Test #2321: mpi_dst_example_simple_lap_c_facto3_sched0_kway_rqrcpend ................***Timeout 495.75 sec Test #2322: mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_rqrcpbegin 
...***Timeout 495.74 sec Test #2323: mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_rqrcpend .....***Timeout 495.74 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2324: mpi_dst_example_simple_lap_c_facto3_sched0_not_tqrcpbegin ...............***Timeout 495.73 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2325: mpi_dst_example_simple_lap_c_facto3_sched0_not_tqrcpend .................***Timeout 495.73 sec Test #2326: 
mpi_dst_example_simple_lap_c_facto3_sched0_kway_tqrcpbegin ..............***Timeout 495.72 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2327: mpi_dst_example_simple_lap_c_facto3_sched0_kway_tqrcpend ................***Timeout 495.72 sec ischedInit: The thread number has been automatically set to 256 Test #2328: mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_tqrcpbegin ...***Timeout 495.71 sec Test #2329: mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_tqrcpend .....***Timeout 495.71 sec Test #2330: mpi_dst_example_simple_lap_c_facto3_sched0_not_rqrrtbegin ...............***Timeout 495.71 sec ischedInit: The thread number has been automatically set to 256 Test #2331: mpi_dst_example_simple_lap_c_facto3_sched0_not_rqrrtend .................***Timeout 495.70 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + 
PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2332: mpi_dst_example_simple_lap_c_facto3_sched0_kway_rqrrtbegin ..............***Timeout 495.70 sec ischedInit: The thread number has been automatically set to 256 Test #2333: mpi_dst_example_simple_lap_c_facto3_sched0_kway_rqrrtend ................***Timeout 495.69 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread 
dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2334: mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_rqrrtbegin ...***Timeout 495.69 sec ischedInit: The thread number has been automatically set to 256 Test #2335: mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_rqrrtend .....***Timeout 495.68 sec ischedInit: The thread number has been automatically set to 256 Test #2336: mpi_dst_example_simple_lap_c_facto3_sched0_kway_pqrcpilu0 ...............***Timeout 495.68 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization 
method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Test #2337: mpi_dst_example_simple_lap_c_facto3_sched0_kway_pqrcpilu1 ...............***Timeout 495.67 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2339: mpi_dst_example_simple_lap_c_facto4_sched0_not_svdend ...................***Timeout 495.67 sec Test #2340: mpi_dst_example_simple_lap_c_facto4_sched0_kway_svdbegin ................***Timeout 495.66 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2341: mpi_dst_example_simple_lap_c_facto4_sched0_kway_svdend ..................***Timeout 495.66 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Test #2343: mpi_dst_example_simple_lap_c_facto4_sched0_kwayprojections_svdend .......***Timeout 495.65 sec Test #2344: mpi_dst_example_simple_lap_c_facto4_sched0_not_pqrcpbegin ...............***Timeout 495.65 sec ischedInit: The thread number has been automatically set to 256 Test #2346: mpi_dst_example_simple_lap_c_facto4_sched0_kway_pqrcpbegin ..............***Timeout 495.64 sec ischedInit: The thread number has been automatically set to 256 Test #2347: mpi_dst_example_simple_lap_c_facto4_sched0_kway_pqrcpend ................***Timeout 495.64 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled 
StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2348: mpi_dst_example_simple_lap_c_facto4_sched0_kwayprojections_pqrcpbegin ...***Timeout 495.63 sec ischedInit: The thread number has been automatically set to 256 Test #2349: mpi_dst_example_simple_lap_c_facto4_sched0_kwayprojections_pqrcpend .....***Timeout 495.63 sec ischedInit: The thread number has been automatically set to 256 Test #2388: mpi_dst_example_simple_lap_z_facto0_sched0_not_tqrcpbegin ...............***Timeout 495.61 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2388: mpi_dst_example_simple_lap_z_facto0_sched0_not_tqrcpbegin Test #2389: mpi_dst_example_simple_lap_z_facto0_sched0_not_tqrcpend .................***Timeout 495.61 sec Start 2389: mpi_dst_example_simple_lap_z_facto0_sched0_not_tqrcpend Test #2390: mpi_dst_example_simple_lap_z_facto0_sched0_kway_tqrcpbegin ..............***Timeout 495.61 sec Start 2390: mpi_dst_example_simple_lap_z_facto0_sched0_kway_tqrcpbegin Test #2391: mpi_dst_example_simple_lap_z_facto0_sched0_kway_tqrcpend ................***Timeout 495.61 sec ischedInit: The thread number has been automatically set to 256 Start 2391: mpi_dst_example_simple_lap_z_facto0_sched0_kway_tqrcpend Test #2392: mpi_dst_example_simple_lap_z_facto0_sched0_kwayprojections_tqrcpbegin 
...***Timeout 495.61 sec Start 2392: mpi_dst_example_simple_lap_z_facto0_sched0_kwayprojections_tqrcpbegin Test #2393: mpi_dst_example_simple_lap_z_facto0_sched0_kwayprojections_tqrcpend .....***Timeout 495.61 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 2393: mpi_dst_example_simple_lap_z_facto0_sched0_kwayprojections_tqrcpend Test #2394: mpi_dst_example_simple_lap_z_facto0_sched0_not_rqrrtbegin ...............***Timeout 495.61 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2394: mpi_dst_example_simple_lap_z_facto0_sched0_not_rqrrtbegin Test #2395: mpi_dst_example_simple_lap_z_facto0_sched0_not_rqrrtend .................***Timeout 495.61 sec Start 2395: mpi_dst_example_simple_lap_z_facto0_sched0_not_rqrrtend Test #2396: mpi_dst_example_simple_lap_z_facto0_sched0_kway_rqrrtbegin 
..............***Timeout 495.61 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2396: mpi_dst_example_simple_lap_z_facto0_sched0_kway_rqrrtbegin Test #2397: mpi_dst_example_simple_lap_z_facto0_sched0_kway_rqrrtend ................***Timeout 495.61 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 
Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2397: mpi_dst_example_simple_lap_z_facto0_sched0_kway_rqrrtend Test #2398: mpi_dst_example_simple_lap_z_facto0_sched0_kwayprojections_rqrrtbegin ...***Timeout 495.61 sec Start 2398: mpi_dst_example_simple_lap_z_facto0_sched0_kwayprojections_rqrrtbegin Test #2399: mpi_dst_example_simple_lap_z_facto0_sched0_kwayprojections_rqrrtend .....***Timeout 495.61 sec Start 2399: mpi_dst_example_simple_lap_z_facto0_sched0_kwayprojections_rqrrtend Test #2400: mpi_dst_example_simple_lap_z_facto0_sched0_kway_pqrcpilu0 ...............***Timeout 495.61 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 2400: mpi_dst_example_simple_lap_z_facto0_sched0_kway_pqrcpilu0 Test #2401: mpi_dst_example_simple_lap_z_facto0_sched0_kway_pqrcpilu1 ...............***Timeout 495.61 sec 
ischedInit: The thread number has been automatically set to 256 Start 2401: mpi_dst_example_simple_lap_z_facto0_sched0_kway_pqrcpilu1 Test #2402: mpi_dst_example_simple_lap_z_facto1_sched0_not_svdbegin .................***Timeout 495.61 sec ischedInit: The thread number has been automatically set to 256 Start 2402: mpi_dst_example_simple_lap_z_facto1_sched0_not_svdbegin Test #2403: mpi_dst_example_simple_lap_z_facto1_sched0_not_svdend ...................***Timeout 495.61 sec ischedInit: The thread number has been automatically set to 256 Start 2403: mpi_dst_example_simple_lap_z_facto1_sched0_not_svdend Test #2404: mpi_dst_example_simple_lap_z_facto1_sched0_kway_svdbegin ................***Timeout 495.61 sec Start 2404: mpi_dst_example_simple_lap_z_facto1_sched0_kway_svdbegin Test #2405: mpi_dst_example_simple_lap_z_facto1_sched0_kway_svdend ..................***Timeout 495.61 sec Start 2405: mpi_dst_example_simple_lap_z_facto1_sched0_kway_svdend Test #2406: mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_svdbegin .....***Timeout 495.61 sec Start 2406: mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_svdbegin Test #2407: mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_svdend .......***Timeout 495.60 sec Start 2407: mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_svdend Test #2408: mpi_dst_example_simple_lap_z_facto1_sched0_not_pqrcpbegin ...............***Timeout 495.60 sec Start 2408: mpi_dst_example_simple_lap_z_facto1_sched0_not_pqrcpbegin Test #2409: mpi_dst_example_simple_lap_z_facto1_sched0_not_pqrcpend .................***Timeout 495.60 sec ischedInit: The thread number has been automatically set to 256 Start 2409: mpi_dst_example_simple_lap_z_facto1_sched0_not_pqrcpend Test #2410: mpi_dst_example_simple_lap_z_facto1_sched0_kway_pqrcpbegin ..............***Timeout 495.60 sec Start 2410: mpi_dst_example_simple_lap_z_facto1_sched0_kway_pqrcpbegin Test #2411: mpi_dst_example_simple_lap_z_facto1_sched0_kway_pqrcpend 
................***Timeout 495.60 sec ischedInit: The thread number has been automatically set to 256 Start 2411: mpi_dst_example_simple_lap_z_facto1_sched0_kway_pqrcpend Test #2412: mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_pqrcpbegin ...***Timeout 495.60 sec Start 2412: mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_pqrcpbegin Test #2413: mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_pqrcpend .....***Timeout 495.60 sec Start 2413: mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_pqrcpend Test #2414: mpi_dst_example_simple_lap_z_facto1_sched0_not_rqrcpbegin ...............***Timeout 495.60 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2414: mpi_dst_example_simple_lap_z_facto1_sched0_not_rqrcpbegin Test #2415: mpi_dst_example_simple_lap_z_facto1_sched0_not_rqrcpend .................***Timeout 495.60 sec ischedInit: The thread number has been 
automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2415: mpi_dst_example_simple_lap_z_facto1_sched0_not_rqrcpend Test #2416: mpi_dst_example_simple_lap_z_facto1_sched0_kway_rqrcpbegin ..............***Timeout 495.59 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal 
Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2416: mpi_dst_example_simple_lap_z_facto1_sched0_kway_rqrcpbegin Test #2417: mpi_dst_example_simple_lap_z_facto1_sched0_kway_rqrcpend ................***Timeout 495.59 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2417: mpi_dst_example_simple_lap_z_facto1_sched0_kway_rqrcpend Test 
#2418: mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_rqrcpbegin ...***Timeout 495.59 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2418: mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_rqrcpbegin Test #2419: mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_rqrcpend .....***Timeout 495.59 sec Start 2419: mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_rqrcpend Test #2420: mpi_dst_example_simple_lap_z_facto1_sched0_not_tqrcpbegin ...............***Timeout 495.59 sec Start 2420: mpi_dst_example_simple_lap_z_facto1_sched0_not_tqrcpbegin Test #2421: mpi_dst_example_simple_lap_z_facto1_sched0_not_tqrcpend .................***Timeout 495.59 sec ischedInit: The thread number has been automatically set to 256 Start 2421: mpi_dst_example_simple_lap_z_facto1_sched0_not_tqrcpend Test #2422: mpi_dst_example_simple_lap_z_facto1_sched0_kway_tqrcpbegin ..............***Timeout 495.59 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels 
of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 [arch-nspawn-3655178:2165659] *** Process received signal *** [arch-nspawn-3655178:2165659] Signal: Segmentation fault (11) [arch-nspawn-3655178:2165659] Signal code: Address not mapped (1) [arch-nspawn-3655178:2165659] Failing at address: 0x7ff84d0a1860 [arch-nspawn-3655178:2165659] [ 0] linux-vdso.so.1(__vdso_rt_sigreturn+0x0) [0x7f3ef71876cc] [arch-nspawn-3655178:2165659] [ 1] /usr/lib/libopen-pal.so.80(mca_btl_sm_poll_handle_frag+0x18a) [0x7f3ef509aa02] [arch-nspawn-3655178:2165659] [ 2] /usr/lib/libopen-pal.so.80(+0x74504) [0x7f3ef509b504] [arch-nspawn-3655178:2165659] [ 3] /usr/lib/libopen-pal.so.80(opal_progress+0x30) [0x7f3ef504ca7a] [arch-nspawn-3655178:2165659] [ 4] /usr/lib/libopen-pal.so.80(ompi_sync_wait_mt+0xda) [0x7f3ef5079aa2] [arch-nspawn-3655178:2165659] [ 5] /usr/lib/libmpi.so.40(+0x7de1a) [0x7f3ef587de1a] [arch-nspawn-3655178:2165659] [ 6] /usr/lib/libmpi.so.40(ompi_request_default_wait+0x1a) [0x7f3ef588019c] [arch-nspawn-3655178:2165659] [ 7] /usr/lib/libmpi.so.40(ompi_coll_base_sendrecv_actual+0x98) [0x7f3ef58f03e8] [arch-nspawn-3655178:2165659] [ 8] /usr/lib/libmpi.so.40(ompi_coll_base_allreduce_intra_recursivedoubling+0x210) [0x7f3ef58f1a88] [arch-nspawn-3655178:2165659] [ 9] /usr/lib/libmpi.so.40(ompi_coll_base_allreduce_intra_ring+0x3fc) [0x7f3ef58f443c] [arch-nspawn-3655178:2165659] [10] /usr/lib/libmpi.so.40(ompi_coll_tuned_allreduce_intra_dec_fixed+0x40) [0x7f3ef5915152] [arch-nspawn-3655178:2165659] [11] /usr/lib/libmpi.so.40(MPI_Allreduce+0x294) [0x7f3ef588e584] [arch-nspawn-3655178:2165659] [12] /build/pastix/src/build/spm/src/libspm.so.1(spmUpdateComputedFields+0x140) [0x7f3ef5bdb458] [arch-nspawn-3655178:2165659] [13] /build/pastix/src/build/spm/src/libspm.so.1(genLaplacian+0xaa) [0x7f3ef5be421e] [arch-nspawn-3655178:2165659] [14] /build/pastix/src/build/spm/src/libspm.so.1(+0x409c8) [0x7f3ef5be59c8] [arch-nspawn-3655178:2165659] [15] ./simple(+0xe2c) 
[0x555555556e2c] [arch-nspawn-3655178:2165659] [16] /usr/lib/libc.so.6(+0x27fae) [0x7f3ef56a4fae] [arch-nspawn-3655178:2165659] [17] /usr/lib/libc.so.6(__libc_start_main+0x72) [0x7f3ef56a50b8] [arch-nspawn-3655178:2165659] [18] ./simple(+0x1174) [0x555555557174] [arch-nspawn-3655178:2165659] *** End of error message *** Start 2422: mpi_dst_example_simple_lap_z_facto1_sched0_kway_tqrcpbegin Test #2423: mpi_dst_example_simple_lap_z_facto1_sched0_kway_tqrcpend ................***Timeout 495.58 sec Start 2423: mpi_dst_example_simple_lap_z_facto1_sched0_kway_tqrcpend Test #2424: mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_tqrcpbegin ...***Timeout 495.58 sec Start 2424: mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_tqrcpbegin Test #2425: mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_tqrcpend .....***Timeout 495.58 sec Start 2425: mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_tqrcpend Test #2427: mpi_dst_example_simple_lap_z_facto1_sched0_not_rqrrtend .................***Timeout 495.58 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections 
distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 2427: mpi_dst_example_simple_lap_z_facto1_sched0_not_rqrrtend Test #2428: mpi_dst_example_simple_lap_z_facto1_sched0_kway_rqrrtbegin ..............***Timeout 495.58 sec Start 2428: mpi_dst_example_simple_lap_z_facto1_sched0_kway_rqrrtbegin Test #2429: mpi_dst_example_simple_lap_z_facto1_sched0_kway_rqrrtend ................***Timeout 495.58 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2429: mpi_dst_example_simple_lap_z_facto1_sched0_kway_rqrrtend Test #2430: mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_rqrrtbegin ...***Timeout 495.58 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2430: mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_rqrrtbegin Test 
#2431: mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_rqrrtend .....***Timeout 495.57 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2431: mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_rqrrtend Test #2432: mpi_dst_example_simple_lap_z_facto1_sched0_kway_pqrcpilu0 ...............***Timeout 495.57 sec Start 2432: mpi_dst_example_simple_lap_z_facto1_sched0_kway_pqrcpilu0 Test #2433: mpi_dst_example_simple_lap_z_facto1_sched0_kway_pqrcpilu1 ...............***Timeout 495.57 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per 
process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2433: mpi_dst_example_simple_lap_z_facto1_sched0_kway_pqrcpilu1 Test #2434: mpi_dst_example_simple_lap_z_facto2_sched0_not_svdbegin .................***Timeout 495.57 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2434: mpi_dst_example_simple_lap_z_facto2_sched0_not_svdbegin Test #2435: mpi_dst_example_simple_lap_z_facto2_sched0_not_svdend ...................***Timeout 495.57 sec ischedInit: The thread number has been 
automatically set to 256 Start 2435: mpi_dst_example_simple_lap_z_facto2_sched0_not_svdend Test #2436: mpi_dst_example_simple_lap_z_facto2_sched0_kway_svdbegin ................***Timeout 495.57 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2436: mpi_dst_example_simple_lap_z_facto2_sched0_kway_svdbegin Test #2438: mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_svdbegin .....***Timeout 495.57 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 2438: mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_svdbegin Test #2439: mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_svdend .......***Timeout 495.56 sec Start 2439: mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_svdend Test #2440: mpi_dst_example_simple_lap_z_facto2_sched0_not_pqrcpbegin ...............***Timeout 495.56 sec Start 2440: 
mpi_dst_example_simple_lap_z_facto2_sched0_not_pqrcpbegin Test #2441: mpi_dst_example_simple_lap_z_facto2_sched0_not_pqrcpend .................***Timeout 495.56 sec ischedInit: The thread number has been automatically set to 256 Start 2441: mpi_dst_example_simple_lap_z_facto2_sched0_not_pqrcpend Test #2442: mpi_dst_example_simple_lap_z_facto2_sched0_kway_pqrcpbegin ..............***Timeout 495.56 sec Start 2442: mpi_dst_example_simple_lap_z_facto2_sched0_kway_pqrcpbegin Test #2443: mpi_dst_example_simple_lap_z_facto2_sched0_kway_pqrcpend ................***Timeout 495.56 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2443: mpi_dst_example_simple_lap_z_facto2_sched0_kway_pqrcpend Test #2444: mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_pqrcpbegin ...***Timeout 495.56 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2444: mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_pqrcpbegin Test #2445: 
mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_pqrcpend .....***Timeout 495.56 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 2445: mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_pqrcpend Test #2446: mpi_dst_example_simple_lap_z_facto2_sched0_not_rqrcpbegin ...............***Timeout 495.55 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - 
Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2446: mpi_dst_example_simple_lap_z_facto2_sched0_not_rqrcpbegin Test #2447: mpi_dst_example_simple_lap_z_facto2_sched0_not_rqrcpend .................***Timeout 495.55 sec ischedInit: The thread number has been automatically set to 256 Start 2447: mpi_dst_example_simple_lap_z_facto2_sched0_not_rqrcpend Test #2448: mpi_dst_example_simple_lap_z_facto2_sched0_kway_rqrcpbegin ..............***Timeout 495.55 sec ischedInit: The thread number has been automatically set to 256 Start 2448: mpi_dst_example_simple_lap_z_facto2_sched0_kway_rqrcpbegin Test #2449: mpi_dst_example_simple_lap_z_facto2_sched0_kway_rqrcpend ................***Timeout 495.55 sec Start 2449: mpi_dst_example_simple_lap_z_facto2_sched0_kway_rqrcpend Test #2450: mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_rqrcpbegin ...***Timeout 495.55 sec Start 2450: mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_rqrcpbegin Test #2451: mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_rqrcpend .....***Timeout 495.55 sec Start 2451: mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_rqrcpend Test #2452: mpi_dst_example_simple_lap_z_facto2_sched0_not_tqrcpbegin ...............***Timeout 495.55 sec ischedInit: The thread number has been automatically set to 256 Start 2452: mpi_dst_example_simple_lap_z_facto2_sched0_not_tqrcpbegin Test #2453: mpi_dst_example_simple_lap_z_facto2_sched0_not_tqrcpend .................***Timeout 495.55 sec ischedInit: The thread number has been automatically set to 256 Start 2453: 
mpi_dst_example_simple_lap_z_facto2_sched0_not_tqrcpend Test #2454: mpi_dst_example_simple_lap_z_facto2_sched0_kway_tqrcpbegin ..............***Timeout 495.54 sec Start 2454: mpi_dst_example_simple_lap_z_facto2_sched0_kway_tqrcpbegin Test #2455: mpi_dst_example_simple_lap_z_facto2_sched0_kway_tqrcpend ................***Timeout 495.54 sec Start 2455: mpi_dst_example_simple_lap_z_facto2_sched0_kway_tqrcpend Test #2456: mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_tqrcpbegin ...***Timeout 495.54 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2456: mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_tqrcpbegin Test #2457: mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_tqrcpend .....***Timeout 495.54 sec Start 2457: mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_tqrcpend Test #2458: mpi_dst_example_simple_lap_z_facto2_sched0_not_rqrrtbegin ...............***Timeout 495.54 sec Start 2458: mpi_dst_example_simple_lap_z_facto2_sched0_not_rqrrtbegin Test #2459: mpi_dst_example_simple_lap_z_facto2_sched0_not_rqrrtend .................***Timeout 495.54 sec Start 2459: mpi_dst_example_simple_lap_z_facto2_sched0_not_rqrrtend Test #2460: mpi_dst_example_simple_lap_z_facto2_sched0_kway_rqrrtbegin ..............***Timeout 495.54 sec ischedInit: The thread number has been automatically set to 256 Start 2460: mpi_dst_example_simple_lap_z_facto2_sched0_kway_rqrrtbegin Test #2461: mpi_dst_example_simple_lap_z_facto2_sched0_kway_rqrrtend ................***Timeout 495.54 sec ischedInit: The thread number has been automatically set to 256 Start 2461: mpi_dst_example_simple_lap_z_facto2_sched0_kway_rqrrtend Test #2462: mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_rqrrtbegin ...***Timeout 495.54 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2462: 
mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_rqrrtbegin Test #2463: mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_rqrrtend .....***Timeout 495.54 sec Start 2463: mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_rqrrtend Test #2464: mpi_dst_example_simple_lap_z_facto2_sched0_kway_pqrcpilu0 ...............***Timeout 495.54 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 2464: mpi_dst_example_simple_lap_z_facto2_sched0_kway_pqrcpilu0 Test #2465: mpi_dst_example_simple_lap_z_facto2_sched0_kway_pqrcpilu1 ...............***Timeout 495.54 sec ischedInit: The thread number has been automatically set to 256 Start 2465: mpi_dst_example_simple_lap_z_facto2_sched0_kway_pqrcpilu1 Test #2466: mpi_dst_example_simple_lap_z_facto3_sched0_not_svdbegin .................***Timeout 495.54 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number 
has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2466: mpi_dst_example_simple_lap_z_facto3_sched0_not_svdbegin Test #2467: mpi_dst_example_simple_lap_z_facto3_sched0_not_svdend ...................***Timeout 495.54 sec Start 2467: mpi_dst_example_simple_lap_z_facto3_sched0_not_svdend Test #2468: mpi_dst_example_simple_lap_z_facto3_sched0_kway_svdbegin ................***Timeout 495.54 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models 
CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2468: mpi_dst_example_simple_lap_z_facto3_sched0_kway_svdbegin Test #2469: mpi_dst_example_simple_lap_z_facto3_sched0_kway_svdend ..................***Timeout 495.54 sec Start 2469: mpi_dst_example_simple_lap_z_facto3_sched0_kway_svdend Test #2470: mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_svdbegin .....***Timeout 495.55 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2470: mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_svdbegin Test #2471: mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_svdend .......***Timeout 495.55 sec ischedInit: The 
thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2471: mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_svdend Test #2472: mpi_dst_example_simple_lap_z_facto3_sched0_not_pqrcpbegin ...............***Timeout 495.54 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2472: mpi_dst_example_simple_lap_z_facto3_sched0_not_pqrcpbegin Test #2473: mpi_dst_example_simple_lap_z_facto3_sched0_not_pqrcpend .................***Timeout 495.55 sec Start 2473: mpi_dst_example_simple_lap_z_facto3_sched0_not_pqrcpend Test #2474: mpi_dst_example_simple_lap_z_facto3_sched0_kway_pqrcpbegin ..............***Timeout 495.54 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2474: mpi_dst_example_simple_lap_z_facto3_sched0_kway_pqrcpbegin Test #2475: mpi_dst_example_simple_lap_z_facto3_sched0_kway_pqrcpend ................***Timeout 495.54 sec ischedInit: The thread number has been automatically set to 256 Start 2475: mpi_dst_example_simple_lap_z_facto3_sched0_kway_pqrcpend Test #2476: mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_pqrcpbegin ...***Timeout 495.55 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2476: mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_pqrcpbegin Test #2477: mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_pqrcpend .....***Timeout 495.55 sec Start 2477: mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_pqrcpend Test #2356: mpi_dst_example_simple_lap_c_facto4_sched0_not_tqrcpbegin ...............***Timeout 495.54 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been 
automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2232: mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_tqrcpbegin ...***Timeout 495.54 sec Test #2242: mpi_dst_example_simple_lap_c_facto1_sched0_not_svdbegin .................***Timeout 495.53 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 
160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Test #2244: mpi_dst_example_simple_lap_c_facto1_sched0_kway_svdbegin ................***Timeout 495.53 sec ischedInit: The thread number has been automatically set to 256 Test #2262: mpi_dst_example_simple_lap_c_facto1_sched0_kway_tqrcpbegin ..............***Timeout 495.52 sec Test #2306: mpi_dst_example_simple_lap_c_facto3_sched0_not_svdbegin .................***Timeout 495.52 sec Test #2308: mpi_dst_example_simple_lap_c_facto3_sched0_kway_svdbegin ................***Timeout 495.51 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2310: mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_svdbegin .....***Timeout 495.50 sec Test #2338: mpi_dst_example_simple_lap_c_facto4_sched0_not_svdbegin .................***Timeout 495.50 sec ischedInit: The thread number has been automatically set to 256 Test #2342: mpi_dst_example_simple_lap_c_facto4_sched0_kwayprojections_svdbegin .....***Timeout 495.49 sec ischedInit: The thread number has been automatically set to 256 Test #2478: mpi_dst_example_simple_lap_z_facto3_sched0_not_rqrcpbegin ...............***Timeout 495.48 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2478: mpi_dst_example_simple_lap_z_facto3_sched0_not_rqrcpbegin Test #2479: mpi_dst_example_simple_lap_z_facto3_sched0_not_rqrcpend .................***Timeout 495.48 sec Start 2479: mpi_dst_example_simple_lap_z_facto3_sched0_not_rqrcpend Test #2480: mpi_dst_example_simple_lap_z_facto3_sched0_kway_rqrcpbegin ..............***Timeout 495.48 sec Start 2480: mpi_dst_example_simple_lap_z_facto3_sched0_kway_rqrcpbegin Test #2481: mpi_dst_example_simple_lap_z_facto3_sched0_kway_rqrcpend ................***Timeout 495.48 sec Start 2481: mpi_dst_example_simple_lap_z_facto3_sched0_kway_rqrcpend Test #2482: mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_rqrcpbegin ...***Timeout 495.48 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per 
process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2482: mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_rqrcpbegin Test #2483: mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_rqrcpend .....***Timeout 495.48 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2483: mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_rqrcpend Test #2484: mpi_dst_example_simple_lap_z_facto3_sched0_not_tqrcpbegin ...............***Timeout 
495.48 sec ischedInit: The thread number has been automatically set to 256 Start 2484: mpi_dst_example_simple_lap_z_facto3_sched0_not_tqrcpbegin Test #2485: mpi_dst_example_simple_lap_z_facto3_sched0_not_tqrcpend .................***Timeout 495.48 sec Start 2485: mpi_dst_example_simple_lap_z_facto3_sched0_not_tqrcpend Test #2486: mpi_dst_example_simple_lap_z_facto3_sched0_kway_tqrcpbegin ..............***Timeout 495.48 sec ischedInit: The thread number has been automatically set to 256 Start 2486: mpi_dst_example_simple_lap_z_facto3_sched0_kway_tqrcpbegin Test #2487: mpi_dst_example_simple_lap_z_facto3_sched0_kway_tqrcpend ................***Timeout 495.48 sec ischedInit: The thread number has been automatically set to 256 Start 2487: mpi_dst_example_simple_lap_z_facto3_sched0_kway_tqrcpend Test #2488: mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_tqrcpbegin ...***Timeout 495.48 sec ischedInit: The thread number has been automatically set to 256 Start 2488: mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_tqrcpbegin Test #2489: mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_tqrcpend .....***Timeout 495.48 sec ischedInit: The thread number has been automatically set to 256 Start 2489: mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_tqrcpend Test #2490: mpi_dst_example_simple_lap_z_facto3_sched0_not_rqrrtbegin ...............***Timeout 495.48 sec ischedInit: The thread number has been automatically set to 256 Start 2490: mpi_dst_example_simple_lap_z_facto3_sched0_not_rqrrtbegin Test #2491: mpi_dst_example_simple_lap_z_facto3_sched0_not_rqrrtend .................***Timeout 495.48 sec Start 2491: mpi_dst_example_simple_lap_z_facto3_sched0_not_rqrrtend Test #2492: mpi_dst_example_simple_lap_z_facto3_sched0_kway_rqrrtbegin ..............***Timeout 495.48 sec ischedInit: The thread number has been automatically set to 256 Start 2492: mpi_dst_example_simple_lap_z_facto3_sched0_kway_rqrrtbegin Test #2493: 
mpi_dst_example_simple_lap_z_facto3_sched0_kway_rqrrtend ................***Timeout 495.48 sec Start 2493: mpi_dst_example_simple_lap_z_facto3_sched0_kway_rqrrtend Test #2494: mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_rqrrtbegin ...***Timeout 495.48 sec Start 2494: mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_rqrrtbegin Test #2495: mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_rqrrtend .....***Timeout 495.48 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2495: mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_rqrrtend Test #2496: 
mpi_dst_example_simple_lap_z_facto3_sched0_kway_pqrcpilu0 ...............***Timeout 495.48 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2496: mpi_dst_example_simple_lap_z_facto3_sched0_kway_pqrcpilu0 Test #2497: mpi_dst_example_simple_lap_z_facto3_sched0_kway_pqrcpilu1 ...............***Timeout 495.48 sec Start 2497: mpi_dst_example_simple_lap_z_facto3_sched0_kway_pqrcpilu1 Test #2498: mpi_dst_example_simple_lap_z_facto4_sched0_not_svdbegin .................***Timeout 495.48 sec ischedInit: The thread number has been automatically set to 256 Start 2498: mpi_dst_example_simple_lap_z_facto4_sched0_not_svdbegin Test #2500: mpi_dst_example_simple_lap_z_facto4_sched0_kway_svdbegin ................***Timeout 495.48 sec Start 2500: mpi_dst_example_simple_lap_z_facto4_sched0_kway_svdbegin Test #2501: mpi_dst_example_simple_lap_z_facto4_sched0_kway_svdend ..................***Timeout 495.48 sec Start 2501: mpi_dst_example_simple_lap_z_facto4_sched0_kway_svdend Test 
#2502: mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_svdbegin .....***Timeout 495.48 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2502: mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_svdbegin Test #2503: mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_svdend .......***Timeout 495.48 sec Start 2503: mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_svdend Test #2504: mpi_dst_example_simple_lap_z_facto4_sched0_not_pqrcpbegin ...............***Timeout 495.48 sec Start 2504: mpi_dst_example_simple_lap_z_facto4_sched0_not_pqrcpbegin Test #2505: mpi_dst_example_simple_lap_z_facto4_sched0_not_pqrcpend .................***Timeout 495.48 sec Start 2505: mpi_dst_example_simple_lap_z_facto4_sched0_not_pqrcpend Test #2506: mpi_dst_example_simple_lap_z_facto4_sched0_kway_pqrcpbegin ..............***Timeout 495.48 sec ischedInit: The thread number has 
been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2506: mpi_dst_example_simple_lap_z_facto4_sched0_kway_pqrcpbegin Test #2507: mpi_dst_example_simple_lap_z_facto4_sched0_kway_pqrcpend ................***Timeout 495.47 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 2507: mpi_dst_example_simple_lap_z_facto4_sched0_kway_pqrcpend Test #2508: mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_pqrcpbegin ...***Timeout 495.47 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2508: mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_pqrcpbegin Test #2509: mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_pqrcpend .....***Timeout 495.47 sec ischedInit: The thread number has been automatically set to 256 Start 2509: mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_pqrcpend Test #2510: 
mpi_dst_example_simple_lap_z_facto4_sched0_not_rqrcpbegin ...............***Timeout 495.47 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2510: mpi_dst_example_simple_lap_z_facto4_sched0_not_rqrcpbegin Test #2511: mpi_dst_example_simple_lap_z_facto4_sched0_not_rqrcpend .................***Timeout 495.48 sec Start 2511: mpi_dst_example_simple_lap_z_facto4_sched0_not_rqrcpend Test #2512: mpi_dst_example_simple_lap_z_facto4_sched0_kway_rqrcpbegin ..............***Timeout 495.48 sec Start 2512: mpi_dst_example_simple_lap_z_facto4_sched0_kway_rqrcpbegin Test #2513: mpi_dst_example_simple_lap_z_facto4_sched0_kway_rqrcpend ................***Timeout 495.48 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 2513: mpi_dst_example_simple_lap_z_facto4_sched0_kway_rqrcpend Test #2514: mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_rqrcpbegin ...***Timeout 495.48 sec Start 2514: mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_rqrcpbegin Test #2515: mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_rqrcpend .....***Timeout 495.48 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2515: mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_rqrcpend Test #2516: mpi_dst_example_simple_lap_z_facto4_sched0_not_tqrcpbegin ...............***Timeout 495.33 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 
Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.733052e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.831673e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.372322e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.045304e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.013604e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 6.229438e-02 s Time to initialize coeftab 
9.278761e-01 s Time to factorize 7.737182e+00 s ( 2.75 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 8.937364e-02 s - iteration 1 : total iteration time 0.0982 s error 1.5512e-14 Time for refinement 2.060501e-01 s || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.550838e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.550838e-14 max(|| b_i - A x_i ||_1) 2.329037e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.876950e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.550838e-14 max(|| b_i - A x_i ||_1) 2.329037e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.876950e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 2.329037e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.876950e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.550838e-14 max(|| b_i - A x_i ||_1) 2.329037e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.876950e-02 (SUCCESS) Start 2516: mpi_dst_example_simple_lap_z_facto4_sched0_not_tqrcpbegin Test #2517: mpi_dst_example_simple_lap_z_facto4_sched0_not_tqrcpend .................***Timeout 495.33 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: 
Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.114114e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.518515e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.223565e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.986680e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.329871e+01 s 
+-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 4.118027e-01 s Time to initialize coeftab 1.492844e-01 s Time to factorize 8.246779e+00 s ( 2.58 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 1.489438e-02 s - iteration 1 : total iteration time 0.0149 s error 3.419e-16 Time for refinement 3.088600e-02 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.613376e-16 max(|| b_i - A x_i ||_1) 8.486531e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.141439e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.613376e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.613376e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.613376e-16 max(|| b_i - A x_i ||_1) 8.486531e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.141439e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 8.486531e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.141439e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 8.486531e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.141439e-03 (SUCCESS) Start 2517: mpi_dst_example_simple_lap_z_facto4_sched0_not_tqrcpend Test #2518: mpi_dst_example_simple_lap_z_facto4_sched0_kway_tqrcpbegin ..............***Timeout 495.33 sec ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.925956e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.354081e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.221736e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time 
to factorize 6.932254e-04 s Time for mapping/scheduling 1.001615e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.039183e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 2.100079e-01 s Time to initialize coeftab 1.587919e+00 s Time to factorize 1.090088e+01 s ( 1.95 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 1.927527e-02 s - iteration 1 : total iteration time 0.0191 s error 1.5512e-14 Time for refinement 5.909178e-02 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.550838e-14 max(|| b_i - A x_i ||_1) 2.329037e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.876950e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.550838e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.550838e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.550838e-14 max(|| b_i - A x_i ||_1) 2.329037e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.876950e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 2.329037e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.876950e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 2.329037e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.876950e-02 (SUCCESS) Start 2518: mpi_dst_example_simple_lap_z_facto4_sched0_kway_tqrcpbegin Test #2519: 
mpi_dst_example_simple_lap_z_facto4_sched0_kway_tqrcpend ................***Timeout 495.34 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.549197e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.213176e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.424824e-03 s +-------------------------------------------------+ Mapping/Scheduling 
subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.016472e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.662214e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 2.281273e-01 s Time to initialize coeftab 3.662021e-01 s Time to factorize 3.694957e+00 s ( 5.77 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 1.015456e-01 s - iteration 1 : total iteration time 0.0637 s error 3.419e-16 Time for refinement 1.502705e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.613349e-16 max(|| b_i - A x_i ||_1) 8.485980e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.141301e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.613349e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.613349e-16 max(|| b_i - A x_i ||_1) 8.485980e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.141301e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.613349e-16 max(|| b_i - A x_i ||_1) 8.485980e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.141301e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 8.485980e-17 max(|| 
b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.141301e-03 (SUCCESS) Start 2519: mpi_dst_example_simple_lap_z_facto4_sched0_kway_tqrcpend Test #2520: mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_tqrcpbegin ...***Timeout 495.34 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.717579e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.605315e-01 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.448382e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.144004e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.023742e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 4.885170e-01 s Time to initialize coeftab 1.404973e+00 s Time to factorize 6.546969e+00 s ( 3.25 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 1.704217e-02 s - iteration 1 : total iteration time 0.0153 s error 1.5512e-14 Time for refinement 2.967839e-02 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.550836e-14 max(|| b_i - A x_i ||_1) 2.328995e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.876843e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.550836e-14 max(|| b_i - A x_i ||_1) 2.328995e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.876843e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.550836e-14 
max(|| b_i - A x_i ||_1) 2.328995e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.876843e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.550836e-14 max(|| b_i - A x_i ||_1) 2.328995e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.876843e-02 (SUCCESS) Start 2520: mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_tqrcpbegin Test #2521: mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_tqrcpend .....***Timeout 495.34 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.226087e+01 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.820374e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.986671e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.178813e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.741482e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 1.613307e-01 s Time to initialize coeftab 1.852937e-01 s Time to factorize 1.255033e+00 s (16.98 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 5.184635e-02 s - iteration 1 : total iteration time 0.0367 s error 3.419e-16 Time for refinement 1.093465e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.613376e-16 max(|| b_i - A x_i ||_1) 8.486531e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.141439e-03 
(SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.613376e-16 max(|| b_i - A x_i ||_1) 8.486531e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.141439e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.613376e-16 max(|| b_i - A x_i ||_1) 8.486531e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.141439e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.613376e-16 max(|| b_i - A x_i ||_1) 8.486531e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.141439e-03 (SUCCESS) Start 2521: mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_tqrcpend Test #2522: mpi_dst_example_simple_lap_z_facto4_sched0_not_rqrrtbegin ...............***Timeout 495.35 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 
300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.247739e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.487697e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.025831e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.146031e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.575551e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 7.297414e-02 s Time to initialize coeftab 1.733437e+00 s Time to factorize 3.563335e+00 s ( 5.98 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 5.977518e-02 s - iteration 1 : total iteration time 0.108 s error 4.5455e-09 - iteration 2 : total iteration time 0.109 s error 5.4821e-10 - iteration 3 : total iteration time 0.0707 s error 3.7667e-13 Time for refinement 4.643037e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i 
||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.766590e-13 max(|| b_i - A x_i ||_1) 2.259507e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.701503e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.766590e-13 max(|| b_i - A x_i ||_1) 2.259507e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.701503e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.766590e-13 max(|| b_i - A x_i ||_1) 2.259507e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.701503e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.766590e-13 max(|| b_i - A x_i ||_1) 2.259507e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.701503e-01 (SUCCESS) Start 2522: mpi_dst_example_simple_lap_z_facto4_sched0_not_rqrrtbegin Test #2523: mpi_dst_example_simple_lap_z_facto4_sched0_not_rqrrtend .................***Timeout 495.35 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: 
The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.925767e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.602907e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.480423e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.519313e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.184717e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 1.421181e-01 s Time to initialize coeftab 5.627122e-01 s Time to factorize 2.803008e+00 s ( 7.60 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 9.289168e-02 s - iteration 1 : total iteration time 0.124 s error 3.6647e-16 Time for refinement 2.778509e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 
2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.921419e-16 max(|| b_i - A x_i ||_1) 9.389207e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.369215e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.921419e-16 max(|| b_i - A x_i ||_1) 9.389207e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.369215e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.921419e-16 max(|| b_i - A x_i ||_1) 9.389207e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.369215e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.921419e-16 max(|| b_i - A x_i ||_1) 9.389207e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.369215e-03 (SUCCESS) Start 2523: mpi_dst_example_simple_lap_z_facto4_sched0_not_rqrrtend Test #2524: mpi_dst_example_simple_lap_z_facto4_sched0_kway_rqrrtbegin ..............***Timeout 495.36 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress 
minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.205757e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.277941e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.859649e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 4.096115e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.864713e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 4.407324e-01 s Time to initialize coeftab 7.206999e-01 s Time to factorize 2.704976e+01 s (806.64 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to 
solve 1.795253e-01 s - iteration 1 : total iteration time 0.166 s error 4.5455e-09 - iteration 2 : total iteration time 0.17 s error 5.4821e-10 - iteration 3 : total iteration time 0.339 s error 3.7667e-13 Time for refinement 9.007567e-01 s || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.766590e-13 max(|| b_i - A x_i ||_1) 2.259504e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.701494e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.766590e-13 max(|| b_i - A x_i ||_1) 2.259504e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.701494e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.766590e-13 max(|| b_i - A x_i ||_1) 2.259504e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.701494e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.766590e-13 max(|| b_i - A x_i ||_1) 2.259504e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.701494e-01 (SUCCESS) Start 2524: mpi_dst_example_simple_lap_z_facto4_sched0_kway_rqrrtbegin Test #2525: mpi_dst_example_simple_lap_z_facto4_sched0_kway_rqrrtend ................***Timeout 495.36 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI 
communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.497850e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.145752e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.456838e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.438178e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.037916e+02 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 5.911814e-01 s Time to initialize coeftab 1.219262e-01 s Time to factorize 1.072689e+01 s ( 1.99 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: 
------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 6.370381e-01 s - iteration 1 : total iteration time 0.412 s error 3.6647e-16 Time for refinement 9.425829e-01 s || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.920126e-16 max(|| b_i - A x_i ||_1) 9.370724e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.364551e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.920126e-16 max(|| b_i - A x_i ||_1) 9.370724e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.364551e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.920126e-16 max(|| b_i - A x_i ||_1) 9.370724e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.364551e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.920126e-16 max(|| b_i - A x_i ||_1) 9.370724e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.364551e-03 (SUCCESS) Start 2525: mpi_dst_example_simple_lap_z_facto4_sched0_kway_rqrrtend Test #2526: mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_rqrrtbegin ...***Timeout 495.37 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.967978e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.720447e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.760271e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.257406e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.424721e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h 
Time to initialize internal csc 5.616743e-01 s Time to initialize coeftab 1.884643e+00 s Time to factorize 9.184478e+00 s ( 2.32 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 3.580096e-02 s - iteration 1 : total iteration time 0.0202 s error 4.5455e-09 - iteration 2 : total iteration time 0.0341 s error 5.4821e-10 - iteration 3 : total iteration time 0.033 s error 3.7667e-13 Time for refinement 1.352385e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.766590e-13 max(|| b_i - A x_i ||_1) 2.259504e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.701494e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.766590e-13 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.766590e-13 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.766590e-13 max(|| b_i - A x_i ||_1) 2.259504e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.701494e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.259504e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.701494e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.259504e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.701494e-01 (SUCCESS) Start 2526: mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_rqrrtbegin Test #2527: mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_rqrrtend .....***Timeout 495.38 sec ischedInit: The thread number has been 
automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.108385e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.161222e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.166929e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 
MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.066953e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.262543e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 4.150672e-01 s Time to initialize coeftab 1.493890e-01 s Time to factorize 4.097674e+00 s ( 5.20 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 3.370025e-01 s - iteration 1 : total iteration time 0.314 s error 3.6647e-16 Time for refinement 7.092849e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.921419e-16 max(|| b_i - A x_i ||_1) 9.389207e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.369215e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.921419e-16 max(|| b_i - A x_i ||_1) 9.389207e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.369215e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.921419e-16 max(|| b_i - A x_i ||_1) 9.389207e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.369215e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.921419e-16 max(|| b_i - A x_i ||_1) 9.389207e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.369215e-03 (SUCCESS) Start 2527: 
mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_rqrrtend Test #2528: mpi_dst_example_simple_lap_z_facto4_sched0_kway_pqrcpilu0 ...............***Timeout 495.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.536421e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.392364e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 
6.259857e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.091065e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.794308e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 6.586703e-01 s Time to initialize coeftab 1.137112e+00 s Time to factorize 6.033498e+00 s ( 3.53 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 1.463585e+00 s - iteration 1 : total iteration time 0.347 s error 6.1304e-15 Time for refinement 5.365601e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.132569e-15 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.132569e-15 max(|| b_i - A x_i ||_1) 8.970997e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.263687e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 8.970997e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.263687e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.132569e-15 max(|| b_i - A x_i ||_1) 8.970997e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.263687e-02 (SUCCESS) max(|| b_i - 
A x_i ||_2 / || b_i ||_2) 6.132569e-15 max(|| b_i - A x_i ||_1) 8.970997e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.263687e-02 (SUCCESS) Start 2528: mpi_dst_example_simple_lap_z_facto4_sched0_kway_pqrcpilu0 Test #2529: mpi_dst_example_simple_lap_z_facto4_sched0_kway_pqrcpilu1 ...............***Timeout 495.40 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.148404e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute 
symbol matrix 5.313672e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.730377e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.220300e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.293253e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 3.010496e-01 s Time to initialize coeftab 1.374197e-01 s Time to factorize 6.377236e+00 s ( 3.34 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 3.098501e-01 s - iteration 1 : total iteration time 0.424 s error 6.1304e-15 Time for refinement 7.702986e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.132569e-15 max(|| b_i - A x_i ||_1) 8.970997e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.263687e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.132569e-15 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.132569e-15 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.132569e-15 max(|| b_i - A x_i ||_1) 8.970997e-16 
max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.263687e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 8.970997e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.263687e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 8.970997e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.263687e-02 (SUCCESS) Start 2529: mpi_dst_example_simple_lap_z_facto4_sched0_kway_pqrcpilu1 Test #2531: mpi_dst_example_simple_lap_s_facto0_sched1_not_svdend ...................***Timeout 496.09 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.723063e+01 s +-------------------------------------------------+ 
Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.478835e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.267585e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.862027e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.103637e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.234332e-01 s Time to initialize coeftab 1.100109e-01 s Time to factorize 5.659576e+00 s (915.93 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 3.374908e+01 s Time for refinement 1.221422e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.982862e-07 max(|| b_i - A x_i ||_1) 8.703612e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.093688e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.982862e-07 max(|| b_i - A x_i ||_1) 8.703612e-08 max(|| b_i - A 
x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.093688e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.982862e-07 max(|| b_i - A x_i ||_1) 8.703612e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.093688e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.982862e-07 max(|| b_i - A x_i ||_1) 8.703612e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.093688e+00 (SUCCESS) Start 2531: mpi_dst_example_simple_lap_s_facto0_sched1_not_svdend Test #2532: mpi_dst_example_simple_lap_s_facto0_sched1_kway_svdbegin ................***Timeout 496.10 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to 
compute ordering 9.482137e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.393344e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.595581e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.704478e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.867356e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 5.896681e-01 s Time to initialize coeftab 1.436976e+00 s Time to factorize 9.110472e+00 s (568.99 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 2.870533e+01 s Time for refinement 1.163368e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.119437e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.119437e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.119437e-07 max(|| b_i - A x_i ||_2 / || b_i 
||_2) 2.119437e-07 max(|| b_i - A x_i ||_1) 9.673837e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.215605e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 9.673837e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.215605e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 9.673837e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.215605e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 9.673837e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.215605e+00 (SUCCESS) Start 2532: mpi_dst_example_simple_lap_s_facto0_sched1_kway_svdbegin Test #2533: mpi_dst_example_simple_lap_s_facto0_sched1_kway_svdend ..................***Timeout 496.10 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 
+-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.858392e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.043941e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.548658e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.113870e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.043777e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.245841e-01 s Time to initialize coeftab 4.807614e-01 s Time to factorize 2.148213e+00 s ( 2.36 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 5.298406e-01 s Time for refinement 8.418414e-01 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.952938e-07 max(|| b_i - A x_i ||_2 / || b_i 
||_2) 1.952938e-07 max(|| b_i - A x_i ||_1) 8.697889e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.092968e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.697889e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.092968e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.952938e-07 max(|| b_i - A x_i ||_1) 8.697889e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.092968e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.952938e-07 max(|| b_i - A x_i ||_1) 8.697889e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.092968e+00 (SUCCESS) Start 2533: mpi_dst_example_simple_lap_s_facto0_sched1_kway_svdend Test #2534: mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_svdbegin .....***Timeout 496.11 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric 
Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.259255e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.228953e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.780263e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 9.334030e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.607372e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.671814e-01 s Time to initialize coeftab 6.260876e-01 s Time to factorize 6.897366e+00 s (751.56 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.712416e+01 s Time for refinement 2.269733e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| 
x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.078858e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.078858e-07 max(|| b_i - A x_i ||_1) 9.626836e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.209699e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.078858e-07 max(|| b_i - A x_i ||_1) 9.626836e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.209699e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.078858e-07 max(|| b_i - A x_i ||_1) 9.626836e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.209699e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 9.626836e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.209699e+00 (SUCCESS) Start 2534: mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_svdbegin Test #2535: mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_svdend .......***Timeout 496.13 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 
3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.147977e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.175228e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.297034e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.524537e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.641021e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.891287e-01 s Time to initialize coeftab 2.641889e+00 s Time to factorize 6.448078e+00 s (803.93 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 2.479685e+01 s Time for refinement 1.240087e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 
4.996107e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.899902e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.899902e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.899902e-07 max(|| b_i - A x_i ||_1) 8.501603e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.068303e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.899902e-07 max(|| b_i - A x_i ||_1) 8.501603e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.068303e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.501603e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.068303e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.501603e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.068303e+00 (SUCCESS) Start 2535: mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_svdend Test #2536: mpi_dst_example_simple_lap_s_facto0_sched1_not_pqrcpbegin ...............***Timeout 496.13 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting 
Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.545648e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.664753e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.222418e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 4.096057e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.777467e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.301602e-02 s Time to initialize coeftab 2.256037e-01 s Time to factorize 2.021540e+00 s ( 2.50 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 7.097201e-01 s - iteration 1 : total iteration time 1.43 s error 3.1698e-11 Time for refinement 2.696676e+00 s || A ||_1 5.112499e-02 max(|| b_i 
||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.146881e-08 max(|| b_i - A x_i ||_1) 3.012851e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.785920e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.146881e-08 max(|| b_i - A x_i ||_1) 3.012851e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.785920e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.146881e-08 max(|| b_i - A x_i ||_1) 3.012851e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.785920e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.146881e-08 max(|| b_i - A x_i ||_1) 3.012851e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.785920e-01 (SUCCESS) Start 2536: mpi_dst_example_simple_lap_s_facto0_sched1_not_pqrcpbegin Test #2537: mpi_dst_example_simple_lap_s_facto0_sched1_not_pqrcpend .................***Timeout 496.15 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min 
ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.299871e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.870663e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.937319e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.057909e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.462294e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.905297e-01 s Time to initialize coeftab 3.216757e-01 s Time to factorize 1.326364e+00 s ( 3.82 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time 
to solve 7.060700e-01 s Time for refinement 9.012768e-01 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.147438e-07 max(|| b_i - A x_i ||_1) 1.280870e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.609529e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.147438e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.147438e-07 max(|| b_i - A x_i ||_1) 1.280870e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.609529e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.280870e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.609529e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.147438e-07 max(|| b_i - A x_i ||_1) 1.280870e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.609529e+00 (SUCCESS) Start 2537: mpi_dst_example_simple_lap_s_facto0_sched1_not_pqrcpend Test #2538: mpi_dst_example_simple_lap_s_facto0_sched1_kway_pqrcpbegin ..............***Timeout 496.16 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 
1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.138342e+02 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.890840e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.065363e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.870292e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.180696e+02 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.306660e+01 s Time to initialize coeftab 1.710157e+00 s Time to factorize 6.398335e+00 s (810.18 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 
Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 9.646080e-01 s - iteration 1 : total iteration time 11.6 s error 5.0581e-11 Time for refinement 5.486828e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.991407e-08 max(|| b_i - A x_i ||_1) 2.916308e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.664604e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.991407e-08 max(|| b_i - A x_i ||_1) 2.916308e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.664604e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.991407e-08 max(|| b_i - A x_i ||_1) 2.916308e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.664604e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.991407e-08 max(|| b_i - A x_i ||_1) 2.916308e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.664604e-01 (SUCCESS) Start 2538: mpi_dst_example_simple_lap_s_facto0_sched1_kway_pqrcpbegin Test #2539: mpi_dst_example_simple_lap_s_facto0_sched1_kway_pqrcpend ................***Timeout 496.17 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 
Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.105027e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.102309e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.381489e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.793671e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.313463e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.365424e-01 s Time to initialize coeftab 1.069999e-01 s Time to factorize 2.325039e+00 s ( 2.18 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ 
Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 7.986241e-01 s Time for refinement 5.965665e-01 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.459463e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.459463e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.459463e-07 max(|| b_i - A x_i ||_1) 1.031509e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.296184e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.031509e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.296184e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.031509e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.296184e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.459463e-07 max(|| b_i - A x_i ||_1) 1.031509e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.296184e+00 (SUCCESS) Start 2539: mpi_dst_example_simple_lap_s_facto0_sched1_kway_pqrcpend Test #2540: mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_pqrcpbegin ...***Timeout 496.18 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication 
support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.729157e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.842170e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.869846e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 6.876786e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.070529e+02 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.496711e-01 s Time to initialize coeftab 2.874800e+00 s Time to factorize 2.179992e+01 s (237.79 KFlop/s) Number of 
operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.092735e+01 s - iteration 1 : total iteration time 6.9 s error 4.2047e-11 Time for refinement 1.537178e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.130093e-08 max(|| b_i - A x_i ||_1) 3.039488e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.819391e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.130093e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.130093e-08 max(|| b_i - A x_i ||_1) 3.039488e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.819391e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.130093e-08 max(|| b_i - A x_i ||_1) 3.039488e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.819391e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 3.039488e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.819391e-01 (SUCCESS) Start 2540: mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_pqrcpbegin Test #2541: mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_pqrcpend .....***Timeout 496.19 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: 
Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.418984e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.823716e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.531734e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 8.098261e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.341952e+01 s +-------------------------------------------------+ Factorization 
task: Factorization used: LL^h Time to initialize internal csc 2.704162e+00 s Time to initialize coeftab 1.596675e+00 s Time to factorize 2.188986e+00 s ( 2.31 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.544163e+00 s Time for refinement 4.233358e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.687662e-07 max(|| b_i - A x_i ||_1) 1.197221e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.504416e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.687662e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.687662e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.687662e-07 max(|| b_i - A x_i ||_1) 1.197221e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.504416e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.197221e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.504416e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.197221e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.504416e+00 (SUCCESS) Start 2541: mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_pqrcpend Test #2542: mpi_dst_example_simple_lap_s_facto0_sched1_not_rqrcpbegin ...............***Timeout 496.20 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel 
Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.482382e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.414458e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.126812e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.300453e+00 s +-------------------------------------------------+ Analyze task: 
Total time for analyze 7.689018e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 8.938264e-02 s Time to initialize coeftab 3.771290e-01 s Time to factorize 3.611678e+00 s ( 1.40 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44 Ko / 44.3 Ko ------------------------------------------------ Total 68.2 Ko / 68.5 Ko Time to solve 1.021045e+00 s - iteration 1 : total iteration time 1.42 s error 3.5577e-11 Time for refinement 2.811990e+00 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.168802e-08 max(|| b_i - A x_i ||_1) 3.052366e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.835573e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.168802e-08 max(|| b_i - A x_i ||_1) 3.052366e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.835573e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.168802e-08 max(|| b_i - A x_i ||_1) 3.052366e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.835573e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.168802e-08 max(|| b_i - A x_i ||_1) 3.052366e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.835573e-01 (SUCCESS) Start 2542: mpi_dst_example_simple_lap_s_facto0_sched1_not_rqrcpbegin Test #2543: mpi_dst_example_simple_lap_s_facto0_sched1_not_rqrcpend .................***Timeout 496.21 sec ischedInit: The thread number has been 
automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.094498e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.600394e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.946742e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: 
Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.826195e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.320300e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.383531e-01 s Time to initialize coeftab 1.744653e-01 s Time to factorize 2.461184e+00 s ( 2.06 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Start 2543: mpi_dst_example_simple_lap_s_facto0_sched1_not_rqrcpend Test #2544: mpi_dst_example_simple_lap_s_facto0_sched1_kway_rqrcpbegin ..............***Timeout 496.22 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has 
been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.398443e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.888474e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.219501e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 4.771462e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.659017e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.886787e-01 s Time to initialize coeftab 5.268911e-01 s Time to factorize 5.884834e+00 s (880.87 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44 Ko / 44.3 Ko ------------------------------------------------ Total 68.2 Ko / 68.5 Ko Time to solve 9.792545e+00 s - iteration 1 : total iteration time 6.44 s error 5.6653e-11 Time for refinement 4.648948e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 
4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.185659e-08 max(|| b_i - A x_i ||_1) 3.007333e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.778986e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.185659e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.185659e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.185659e-08 max(|| b_i - A x_i ||_1) 3.007333e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.778986e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 3.007333e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.778986e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 3.007333e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.778986e-01 (SUCCESS) Start 2544: mpi_dst_example_simple_lap_s_facto0_sched1_kway_rqrcpbegin Test #2545: mpi_dst_example_simple_lap_s_facto0_sched1_kway_rqrcpend ................***Timeout 496.23 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 
Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.781246e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.267670e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.096613e+01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.019036e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.100586e+02 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.767723e-01 s Time to initialize coeftab 8.308391e-02 s Time to factorize 2.763882e+00 s ( 1.83 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 2.540193e+00 s Time for refinement 
2.296832e+00 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.295300e-07 max(|| b_i - A x_i ||_1) 9.966475e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.252378e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.295300e-07 max(|| b_i - A x_i ||_1) 9.966475e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.252378e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.295300e-07 max(|| b_i - A x_i ||_1) 9.966475e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.252378e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.295300e-07 max(|| b_i - A x_i ||_1) 9.966475e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.252378e+00 (SUCCESS) Start 2545: mpi_dst_example_simple_lap_s_facto0_sched1_kway_rqrcpend Test #2546: mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_rqrcpbegin ...***Timeout 496.24 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 
Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.077463e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.755402e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.486436e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 8.366491e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.209752e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.490558e-01 s Time to initialize coeftab 4.870201e-01 s Time to factorize 1.097144e+00 s ( 4.61 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44 Ko / 44.3 Ko 
------------------------------------------------ Total 68.2 Ko / 68.5 Ko Time to solve 8.531108e-01 s - iteration 1 : total iteration time 1.39 s error 3.225e-11 Time for refinement 3.123530e+00 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.075538e-08 max(|| b_i - A x_i ||_1) 3.016129e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.790038e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.075538e-08 max(|| b_i - A x_i ||_1) 3.016129e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.790038e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.075538e-08 max(|| b_i - A x_i ||_1) 3.016129e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.790038e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.075538e-08 max(|| b_i - A x_i ||_1) 3.016129e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.790038e-01 (SUCCESS) Start 2546: mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_rqrcpbegin Test #2547: mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_rqrcpend .....***Timeout 496.25 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 
320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.132570e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.694440e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.431574e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.278210e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.306537e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.040261e-01 s Time to initialize coeftab 3.440101e-01 s Time to factorize 2.394750e+00 s ( 2.11 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: 
------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.175604e+00 s Time for refinement 8.210989e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.203837e-07 max(|| b_i - A x_i ||_1) 1.288335e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.618910e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.203837e-07 max(|| b_i - A x_i ||_1) 1.288335e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.618910e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.203837e-07 max(|| b_i - A x_i ||_1) 1.288335e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.618910e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.203837e-07 max(|| b_i - A x_i ||_1) 1.288335e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.618910e+00 (SUCCESS) Start 2547: mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_rqrcpend Test #2548: mpi_dst_example_simple_lap_s_facto0_sched1_not_tqrcpbegin ...............***Timeout 496.26 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: 
Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.686464e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.518892e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.022118e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.216875e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.007670e+02 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.962556e-01 s Time to initialize coeftab 5.027595e-01 s Time to factorize 
5.265258e+00 s (984.53 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44 Ko / 44.3 Ko ------------------------------------------------ Total 68.2 Ko / 68.5 Ko Time to solve 9.931461e-01 s - iteration 1 : total iteration time 10.7 s error 4.9351e-11 Time for refinement 5.369255e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.972402e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.972402e-08 max(|| b_i - A x_i ||_1) 2.925179e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.675751e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.972402e-08 max(|| b_i - A x_i ||_1) 2.925179e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.675751e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.925179e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.675751e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.972402e-08 max(|| b_i - A x_i ||_1) 2.925179e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.675751e-01 (SUCCESS) Start 2548: mpi_dst_example_simple_lap_s_facto0_sched1_not_tqrcpbegin Test #2549: mpi_dst_example_simple_lap_s_facto0_sched1_not_tqrcpend .................***Timeout 496.27 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread 
dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.159694e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.244954e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.664390e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 5.733727e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.273025e+01 s +-------------------------------------------------+ 
Factorization task: Factorization used: LL^h Time to initialize internal csc 1.735028e-01 s Time to initialize coeftab 8.883298e-02 s Time to factorize 5.044087e+00 s ( 1.00 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 6.765675e-01 s Time for refinement 1.234724e+00 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.903478e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.903478e-07 max(|| b_i - A x_i ||_1) 8.514470e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.069920e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.514470e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.069920e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.903478e-07 max(|| b_i - A x_i ||_1) 8.514470e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.069920e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.903478e-07 max(|| b_i - A x_i ||_1) 8.514470e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.069920e+00 (SUCCESS) Start 2549: mpi_dst_example_simple_lap_s_facto0_sched1_not_tqrcpend Test #2551: mpi_dst_example_simple_lap_s_facto0_sched1_kway_tqrcpend ................***Timeout 497.21 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel 
Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.235216e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.145153e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.422890e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.545380e+00 s +-------------------------------------------------+ Analyze task: Total 
time for analyze 7.425091e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.910856e-01 s Time to initialize coeftab 5.678105e-01 s Time to factorize 7.343946e-01 s ( 6.89 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 6.374094e-01 s Time for refinement 5.382488e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.674541e-07 max(|| b_i - A x_i ||_1) 1.159586e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.457125e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.674541e-07 max(|| b_i - A x_i ||_1) 1.159586e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.457125e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.674541e-07 max(|| b_i - A x_i ||_1) 1.159586e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.457125e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.674541e-07 max(|| b_i - A x_i ||_1) 1.159586e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.457125e+00 (SUCCESS) Start 2551: mpi_dst_example_simple_lap_s_facto0_sched1_kway_tqrcpend Test #2552: mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_tqrcpbegin ...***Timeout 497.23 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been 
automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.063747e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.032955e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.108664e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s 
Time for mapping/scheduling 1.563179e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.288501e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.725198e-01 s Time to initialize coeftab 2.924185e+00 s Time to factorize 2.115984e+01 s (244.98 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.2 Ko / 44.3 Ko ------------------------------------------------ Total 68.4 Ko / 68.5 Ko Time to solve 1.660996e+01 s - iteration 1 : total iteration time 6.81 s error 3.6672e-11 Time for refinement 1.727927e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.798388e-08 max(|| b_i - A x_i ||_1) 2.879787e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.618713e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.798388e-08 max(|| b_i - A x_i ||_1) 2.879787e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.618713e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.798388e-08 max(|| b_i - A x_i ||_1) 2.879787e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.618713e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.798388e-08 max(|| b_i - A x_i ||_1) 2.879787e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.618713e-01 (SUCCESS) Start 2552: mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_tqrcpbegin Test #2553: 
mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_tqrcpend .....***Timeout 497.24 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.142830e+02 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.632670e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.763472e-01 s +-------------------------------------------------+ 
Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.289471e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.189066e+02 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.938457e-01 s Time to initialize coeftab 2.385426e-01 s Time to factorize 2.233966e+00 s ( 2.27 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 5.742610e-01 s Time for refinement 4.011304e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.671397e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.671397e-07 max(|| b_i - A x_i ||_1) 1.176982e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.478984e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.671397e-07 max(|| b_i - A x_i ||_1) 1.176982e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.478984e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.671397e-07 max(|| b_i - A x_i ||_1) 1.176982e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.478984e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.176982e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * 
eps)) 1.478984e+00 (SUCCESS) Start 2553: mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_tqrcpend Test #2555: mpi_dst_example_simple_lap_s_facto0_sched1_not_rqrrtend .................***Timeout 497.81 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.740583e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.755573e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 
Stoping criterion -1 Time for reordering 2.007391e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.213191e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.153092e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.499173e-01 s Time to initialize coeftab 9.324254e-02 s Time to factorize 4.560380e+00 s ( 1.11 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 9.296675e-01 s Time for refinement 3.922879e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.356669e-07 max(|| b_i - A x_i ||_1) 1.385227e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.740663e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.356669e-07 max(|| b_i - A x_i ||_1) 1.385227e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.740663e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.356669e-07 max(|| b_i - A x_i ||_1) 1.385227e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.740663e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i 
||_2) 8.356669e-07 max(|| b_i - A x_i ||_1) 1.385227e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.740663e+00 (SUCCESS) Start 2555: mpi_dst_example_simple_lap_s_facto0_sched1_not_rqrrtend Test #2556: mpi_dst_example_simple_lap_s_facto0_sched1_kway_rqrrtbegin ..............***Timeout 497.82 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.912464e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.549528e-03 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.105632e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 4.069714e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.601800e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 8.213958e-01 s Time to initialize coeftab 1.442790e+00 s Time to factorize 4.395993e+00 s ( 1.15 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 2.808421e+01 s - iteration 1 : total iteration time 8.55 s error 5.2229e-11 Time for refinement 2.726633e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.137405e-08 max(|| b_i - A x_i ||_1) 3.029319e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.806614e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.137405e-08 max(|| b_i - A x_i ||_1) 3.029319e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.806614e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.137405e-08 
max(|| b_i - A x_i ||_1) 3.029319e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.806614e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.137405e-08 max(|| b_i - A x_i ||_1) 3.029319e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.806614e-01 (SUCCESS) Start 2556: mpi_dst_example_simple_lap_s_facto0_sched1_kway_rqrrtbegin Test #2557: mpi_dst_example_simple_lap_s_facto0_sched1_kway_rqrrtend ................***Timeout 497.84 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.078641e+01 s +-------------------------------------------------+ Symbolic factorization 
subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.268994e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.879912e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 4.370330e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.348060e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.463467e-01 s Time to initialize coeftab 1.569854e-01 s Time to factorize 5.340501e-01 s ( 9.48 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 9.061358e-01 s Time for refinement 1.741576e+00 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.976651e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.976651e-07 max(|| b_i - A x_i ||_1) 8.609523e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.081864e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.976651e-07 max(|| b_i - A x_i ||_2 / 
|| b_i ||_2) 1.976651e-07 max(|| b_i - A x_i ||_1) 8.609523e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.081864e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.609523e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.081864e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.609523e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.081864e+00 (SUCCESS) Start 2557: mpi_dst_example_simple_lap_s_facto0_sched1_kway_rqrrtend Test #2558: mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_rqrrtbegin ...***Timeout 497.84 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to 
compute ordering 6.443952e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.491942e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.108124e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.265379e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.644509e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.422394e-01 s Time to initialize coeftab 3.072230e-01 s Time to factorize 5.294165e+00 s (979.15 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 5.971884e-01 s - iteration 1 : total iteration time 1.5 s error 4.9983e-11 Time for refinement 2.772777e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.120935e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.120935e-08 max(|| b_i - A x_i ||_1) 
2.983267e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.748745e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.120935e-08 max(|| b_i - A x_i ||_1) 2.983267e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.748745e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.983267e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.748745e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.120935e-08 max(|| b_i - A x_i ||_1) 2.983267e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.748745e-01 (SUCCESS) Start 2558: mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_rqrrtbegin Test #2559: mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_rqrrtend .....***Timeout 497.86 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 
nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.102658e+02 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.836716e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.934455e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 4.305611e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.188938e+02 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.175293e-01 s Time to initialize coeftab 7.146832e-02 s Time to factorize 1.866844e+00 s ( 2.71 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 3.128875e+00 s Time for refinement 1.216010e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A 
x_i ||_2 / || b_i ||_2) 3.034566e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.034566e-07 max(|| b_i - A x_i ||_1) 9.614900e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.208199e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 9.614900e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.208199e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.034566e-07 max(|| b_i - A x_i ||_1) 9.614900e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.208199e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.034566e-07 max(|| b_i - A x_i ||_1) 9.614900e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.208199e+00 (SUCCESS) Start 2559: mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_rqrrtend Test #2560: mpi_dst_example_simple_lap_s_facto0_sched1_kway_pqrcpilu0 ...............***Timeout 497.88 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread 
number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.045731e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.079123e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.170999e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.044895e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.262014e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.612079e-01 s Time to initialize coeftab 3.917805e-01 s Time to factorize 2.727708e+00 s ( 1.86 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.541941e+00 s - iteration 1 : total iteration time 47.1 s error 2.6977e-11 Time for refinement 5.254366e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 
4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.935525e-08 max(|| b_i - A x_i ||_1) 2.874617e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.612216e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.935525e-08 max(|| b_i - A x_i ||_1) 2.874617e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.612216e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.935525e-08 max(|| b_i - A x_i ||_1) 2.874617e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.612216e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.935525e-08 max(|| b_i - A x_i ||_1) 2.874617e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.612216e-01 (SUCCESS) Start 2560: mpi_dst_example_simple_lap_s_facto0_sched1_kway_pqrcpilu0 Test #2562: mpi_dst_example_simple_lap_s_facto1_sched1_not_svdbegin .................***Timeout 498.42 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 
Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.605729e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.837958e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.514042e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 4.221263e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.163458e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 7.385536e-01 s Time to initialize coeftab 7.200945e-01 s Time to factorize 4.417434e+00 s ( 1.18 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.1 Ko / 44.3 Ko ------------------------------------------------ Total 68.3 Ko / 68.5 Ko Time to solve 2.917330e+01 s Time for refinement 1.324836e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 
4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.912017e-07 max(|| b_i - A x_i ||_1) 8.325644e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.046192e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.912017e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.912017e-07 max(|| b_i - A x_i ||_1) 8.325644e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.046192e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.912017e-07 max(|| b_i - A x_i ||_1) 8.325644e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.046192e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.325644e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.046192e+00 (SUCCESS) Start 2562: mpi_dst_example_simple_lap_s_facto1_sched1_not_svdbegin Test #2563: mpi_dst_example_simple_lap_s_facto1_sched1_not_svdend ...................***Timeout 498.42 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block 
Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.283475e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.868735e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.191498e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.014990e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.514808e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.455813e-01 s Time to initialize coeftab 7.411807e-01 s Time to factorize 6.037985e+00 s (887.56 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.685732e+01 s Time for 
refinement 2.651657e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.700328e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.700328e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.700328e-07 max(|| b_i - A x_i ||_1) 7.497481e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.421263e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.700328e-07 max(|| b_i - A x_i ||_1) 7.497481e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.421263e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 7.497481e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.421263e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 7.497481e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.421263e-01 (SUCCESS) Start 2563: mpi_dst_example_simple_lap_s_facto1_sched1_not_svdend Test #2566: mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_svdbegin .....***Timeout 499.23 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 
1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.860235e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.041890e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.703528e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.606554e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.872860e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 6.072557e-01 s Time to initialize coeftab 1.567188e+00 s Time to factorize 8.662404e+00 s (618.66 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko 
------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 3.217485e+01 s Time for refinement 5.627779e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996109e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996109e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996109e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996109e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.925748e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.925748e-07 max(|| b_i - A x_i ||_1) 8.646692e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.086535e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.925748e-07 max(|| b_i - A x_i ||_1) 8.646692e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.086535e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.925748e-07 max(|| b_i - A x_i ||_1) 8.646692e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.086535e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.646692e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.086535e+00 (SUCCESS) Start 2566: mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_svdbegin Test #2569: mpi_dst_example_simple_lap_s_facto1_sched1_not_pqrcpend .................***Timeout 499.91 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy 
Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch 1: 300 1140 2: 200 760 3: 200 660 Time to compute ordering 5.938211e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.670255e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.337792e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 9.550180e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.069555e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.697030e-01 s Time to initialize coeftab 1.305709e-01 s Time to factorize 5.427833e+00 s (987.33 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank 
supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 8.747907e-01 s Time for refinement 1.177358e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.044738e-07 max(|| b_i - A x_i ||_1) 1.197624e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.504923e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.044738e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.044738e-07 max(|| b_i - A x_i ||_1) 1.197624e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.504923e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.197624e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.504923e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.044738e-07 max(|| b_i - A x_i ||_1) 1.197624e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.504923e+00 (SUCCESS) Start 2569: mpi_dst_example_simple_lap_s_facto1_sched1_not_pqrcpend Test #2570: mpi_dst_example_simple_lap_s_facto1_sched1_kway_pqrcpbegin ..............***Timeout 499.91 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models 
CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.052668e+02 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.674436e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.062786e+01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.481545e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.182018e+02 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.701479e-01 s Time to initialize coeftab 1.870630e-01 s Time to factorize 4.952084e+00 s ( 1.06 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: 
------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.105015e+00 s - iteration 1 : total iteration time 31.5 s error 5.4677e-11 Time for refinement 5.393217e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.824286e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.824286e-08 max(|| b_i - A x_i ||_1) 2.905100e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.650521e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.824286e-08 max(|| b_i - A x_i ||_1) 2.905100e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.650521e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.824286e-08 max(|| b_i - A x_i ||_1) 2.905100e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.650521e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.905100e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.650521e-01 (SUCCESS) Start 2570: mpi_dst_example_simple_lap_s_facto1_sched1_kway_pqrcpbegin Test #2575: mpi_dst_example_simple_lap_s_facto1_sched1_not_rqrcpend .................***Timeout 500.99 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled 
thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.398605e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.984909e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.526218e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.173368e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.725214e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 9.102549e-01 s Time 
to initialize coeftab 6.872160e-01 s Time to factorize 2.896242e+00 s ( 1.81 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 3.905436e+01 s Time for refinement 8.428873e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.541626e-07 max(|| b_i - A x_i ||_1) 1.301685e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.635685e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.541626e-07 max(|| b_i - A x_i ||_1) 1.301685e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.635685e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.541626e-07 max(|| b_i - A x_i ||_1) 1.301685e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.635685e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.541626e-07 max(|| b_i - A x_i ||_1) 1.301685e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.635685e+00 (SUCCESS) Start 2575: mpi_dst_example_simple_lap_s_facto1_sched1_not_rqrcpend 2391/3626 Test #2611: mpi_dst_example_simple_lap_s_facto2_sched1_not_tqrcpbegin ...............***Timeout 501.50 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 
6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.439481e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.793332e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.914177e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.874728e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.856563e+01 s 
+-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 4.508309e-01 s Time to initialize coeftab 9.591774e-01 s Time to factorize 9.541169e+00 s ( 1.05 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.4 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 2.532774e+01 s - iteration 1 : total iteration time 8.36 s error 4.3418e-11 Time for refinement 2.504142e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.919963e-08 max(|| b_i - A x_i ||_1) 2.966087e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.727157e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.919963e-08 max(|| b_i - A x_i ||_1) 2.966087e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.727157e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.919963e-08 max(|| b_i - A x_i ||_1) 2.966087e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.727157e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.919963e-08 max(|| b_i - A x_i ||_1) 2.966087e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.727157e-01 (SUCCESS) Start 2611: mpi_dst_example_simple_lap_s_facto2_sched1_not_tqrcpbegin 2391/3626 Test #2612: mpi_dst_example_simple_lap_s_facto2_sched1_not_tqrcpend .................***Timeout 501.50 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The 
thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.460217e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.285004e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.624591e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 
1.548061e-03 s Time for mapping/scheduling 2.740049e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.022338e+02 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.822331e-01 s Time to initialize coeftab 5.994916e-02 s Time to factorize 1.089849e+01 s (938.13 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 3.419317e+01 s - iteration 1 : total iteration time 6.77 s error 1.1737e-12 Time for refinement 1.756374e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.594887e-08 max(|| b_i - A x_i ||_1) 2.736361e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.438485e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.594887e-08 max(|| b_i - A x_i ||_1) 2.736361e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.438485e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.594887e-08 max(|| b_i - A x_i ||_1) 2.736361e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.438485e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.594887e-08 max(|| b_i - A x_i ||_1) 2.736361e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.438485e-01 (SUCCESS) Start 2612: mpi_dst_example_simple_lap_s_facto2_sched1_not_tqrcpend 2391/3626 Test #2613: 
mpi_dst_example_simple_lap_s_facto2_sched1_kway_tqrcpbegin ..............***Timeout 501.50 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.721999e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.665254e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.556315e+00 s +-------------------------------------------------+ Mapping/Scheduling 
subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 9.831800e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.209832e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.820387e-01 s Time to initialize coeftab 5.107421e-01 s Time to factorize 5.856941e+00 s ( 1.70 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.5 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 3.167934e+01 s - iteration 1 : total iteration time 10.2 s error 4.07e-11 Time for refinement 2.825670e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.219165e-08 max(|| b_i - A x_i ||_1) 3.011281e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.783947e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.219165e-08 max(|| b_i - A x_i ||_1) 3.011281e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.783947e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.219165e-08 max(|| b_i - A x_i ||_1) 3.011281e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.783947e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.219165e-08 max(|| b_i - A x_i ||_1) 3.011281e-08 max(|| b_i - A x_i 
||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.783947e-01 (SUCCESS) Start 2613: mpi_dst_example_simple_lap_s_facto2_sched1_kway_tqrcpbegin 2391/3626 Test #2614: mpi_dst_example_simple_lap_s_facto2_sched1_kway_tqrcpend ................***Timeout 501.50 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.668970e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.483568e-02 s +-------------------------------------------------+ Reordering 
subtask: Split level 0 Stoping criterion -1 Time for reordering 2.728516e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 5.774718e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.304243e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 9.199687e-02 s Time to initialize coeftab 1.423797e+00 s Time to factorize 9.767230e+00 s ( 1.02 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 9.797284e+00 s - iteration 1 : total iteration time 15.8 s error 5.9454e-13 Time for refinement 2.049140e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.588697e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.588697e-08 max(|| b_i - A x_i ||_1) 2.793608e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.510421e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.588697e-08 max(|| b_i - A x_i ||_1) 2.793608e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.510421e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.793608e-08 max(|| b_i - A x_i ||_1 / (||A||_1 
* ||x_i||_oo * eps)) 3.510421e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.588697e-08 max(|| b_i - A x_i ||_1) 2.793608e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.510421e-01 (SUCCESS) Start 2614: mpi_dst_example_simple_lap_s_facto2_sched1_kway_tqrcpend 2391/3626 Test #2618: mpi_dst_example_simple_lap_s_facto2_sched1_not_rqrrtend .................***Timeout 502.25 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.126960e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of 
nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.761564e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.863991e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 8.786014e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.252573e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 4.459372e-01 s Time to initialize coeftab 1.712043e-01 s Time to factorize 3.470786e+00 s ( 2.88 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 3.959849e-01 s - iteration 1 : total iteration time 1.24 s error 1.7943e-12 Time for refinement 2.663006e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.497560e-08 max(|| b_i - A x_i ||_1) 2.720898e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.419054e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.497560e-08 max(|| b_i - A x_i ||_1) 2.720898e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * 
||x_i||_oo * eps)) 3.419054e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.497560e-08 max(|| b_i - A x_i ||_1) 2.720898e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.419054e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.497560e-08 max(|| b_i - A x_i ||_1) 2.720898e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.419054e-01 (SUCCESS) Start 2618: mpi_dst_example_simple_lap_s_facto2_sched1_not_rqrrtend 2391/3626 Test #2620: mpi_dst_example_simple_lap_s_facto2_sched1_kway_rqrrtend ................***Timeout 502.49 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute 
ordering 1.030719e+02 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.113582e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.002076e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 6.845849e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.121253e+02 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.118900e+00 s Time to initialize coeftab 3.414571e-01 s Time to factorize 4.294987e+00 s ( 2.32 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 2.546189e+01 s - iteration 1 : total iteration time 6.08 s error 3.9744e-12 Time for refinement 2.419241e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.791813e-08 max(|| b_i - A x_i ||_1) 2.826052e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * 
eps)) 3.551189e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.791813e-08 max(|| b_i - A x_i ||_1) 2.826052e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.551189e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.791813e-08 max(|| b_i - A x_i ||_1) 2.826052e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.551189e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.791813e-08 max(|| b_i - A x_i ||_1) 2.826052e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.551189e-01 (SUCCESS) Start 2620: mpi_dst_example_simple_lap_s_facto2_sched1_kway_rqrrtend 2391/3626 Test #2622: mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_rqrrtend .....***Timeout 502.75 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 
Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.193534e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.090848e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.706356e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.390192e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.937606e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.466180e-01 s Time to initialize coeftab 1.865605e-01 s Time to factorize 4.178380e+00 s ( 2.39 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 1.008707e+01 s - iteration 1 : total iteration time 6.81 s error 2.1305e-12 Time for refinement 4.639140e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 
max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.646880e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.646880e-08 max(|| b_i - A x_i ||_1) 2.722007e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.420448e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.646880e-08 max(|| b_i - A x_i ||_1) 2.722007e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.420448e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.646880e-08 max(|| b_i - A x_i ||_1) 2.722007e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.420448e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.722007e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.420448e-01 (SUCCESS) Start 2622: mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_rqrrtend 2391/3626 Test #2623: mpi_dst_example_simple_lap_s_facto2_sched1_kway_pqrcpilu0 ...............***Timeout 502.75 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has 
been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.014927e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.512008e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.969635e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.718754e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.520354e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 4.560766e-01 s Time to initialize coeftab 1.729871e-01 s Time to factorize 4.314360e+00 s ( 2.31 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 9.259273e-01 s - iteration 1 : total iteration time 18.5 s error 1.0408e-11 Time for refinement 5.591412e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 || A ||_1 
5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.880571e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.880571e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.880571e-08 max(|| b_i - A x_i ||_1) 2.908079e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.654265e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.880571e-08 max(|| b_i - A x_i ||_1) 2.908079e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.654265e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.908079e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.654265e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.908079e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.654265e-01 (SUCCESS) Start 2623: mpi_dst_example_simple_lap_s_facto2_sched1_kway_pqrcpilu0 2391/3626 Test #2624: mpi_dst_example_simple_lap_s_facto2_sched1_kway_pqrcpilu1 ...............***Timeout 502.76 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 
Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.164537e+02 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.997480e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.760486e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.185567e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.198070e+02 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.103053e-01 s Time to initialize coeftab 1.319976e-01 s Time to factorize 3.853599e+00 s ( 2.59 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 1.082741e+00 s - iteration 1 : total iteration time 9.62 s error 1.0471e-11 Time for 
refinement 5.741232e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.936663e-08 max(|| b_i - A x_i ||_1) 2.912133e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.659358e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.936663e-08 max(|| b_i - A x_i ||_1) 2.912133e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.659358e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.936663e-08 max(|| b_i - A x_i ||_1) 2.912133e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.659358e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.936663e-08 max(|| b_i - A x_i ||_1) 2.912133e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.659358e-01 (SUCCESS) Start 2624: mpi_dst_example_simple_lap_s_facto2_sched1_kway_pqrcpilu1 2391/3626 Test #2632: mpi_dst_example_simple_lap_d_facto0_sched1_not_pqrcpend .................***Timeout 505.06 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 
- Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.606811e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.393261e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.917319e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.491422e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.806622e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.162697e-02 s Time to initialize coeftab 6.856332e-01 s Time to factorize 8.284656e-01 s ( 6.11 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko 
------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.537687e+00 s - iteration 1 : total iteration time 2.43 s error 3.0228e-16 Time for refinement 5.244188e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.247069e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.247069e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.247069e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.247069e-16 max(|| b_i - A x_i ||_1) 7.486080e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.406903e-04 (SUCCESS) max(|| b_i - A x_i ||_1) 7.486080e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.406903e-04 (SUCCESS) max(|| b_i - A x_i ||_1) 7.486080e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.406903e-04 (SUCCESS) max(|| b_i - A x_i ||_1) 7.486080e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.406903e-04 (SUCCESS) Start 2632: mpi_dst_example_simple_lap_d_facto0_sched1_not_pqrcpend 2391/3626 Test #2633: mpi_dst_example_simple_lap_d_facto0_sched1_kway_pqrcpbegin ..............***Timeout 505.07 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 
Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.056444e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.729595e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.856447e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 9.707311e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.201159e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.273735e-01 s Time to initialize coeftab 1.993908e-01 s Time to factorize 2.936829e+00 s ( 1.72 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank 
supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 7.504949e-01 s - iteration 1 : total iteration time 1.75 s error 1.5278e-14 Time for refinement 2.817619e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.528222e-14 max(|| b_i - A x_i ||_1) 2.922616e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.672518e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.528222e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.528222e-14 max(|| b_i - A x_i ||_1) 2.922616e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.672518e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.528222e-14 max(|| b_i - A x_i ||_1) 2.922616e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.672518e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 2.922616e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.672518e-02 (SUCCESS) Start 2633: mpi_dst_example_simple_lap_d_facto0_sched1_kway_pqrcpbegin 2391/3626 Test #2634: mpi_dst_example_simple_lap_d_facto0_sched1_kway_pqrcpend ................***Timeout 505.07 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: 
Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.328076e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.418136e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.066598e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 7.111488e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.227616e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.108596e-01 s Time to initialize coeftab 8.781993e+00 s Time to factorize 
1.672814e+01 s (309.88 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.167167e+01 s - iteration 1 : total iteration time 8.14 s error 3.4923e-16 Time for refinement 1.632229e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.660472e-16 max(|| b_i - A x_i ||_1) 8.247189e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.036330e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.660472e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.660472e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.660472e-16 max(|| b_i - A x_i ||_1) 8.247189e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.036330e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 8.247189e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.036330e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 8.247189e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.036330e-03 (SUCCESS) Start 2634: mpi_dst_example_simple_lap_d_facto0_sched1_kway_pqrcpend 2391/3626 Test #2636: mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_pqrcpend .....***Timeout 505.08 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.050255e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.017126e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.494827e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.547458e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.294792e+01 s 
+-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.788683e-01 s Time to initialize coeftab 3.671156e-01 s Time to factorize 9.636590e-01 s ( 5.25 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 6.356735e-01 s - iteration 1 : total iteration time 1.52 s error 9.6569e-16 Time for refinement 3.541807e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.759445e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.759445e-16 max(|| b_i - A x_i ||_1) 1.311823e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.648418e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 1.311823e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.648418e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.759445e-16 max(|| b_i - A x_i ||_1) 1.311823e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.648418e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.759445e-16 max(|| b_i - A x_i ||_1) 1.311823e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.648418e-03 (SUCCESS) Start 2636: mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_pqrcpend 2391/3626 Test #2637: mpi_dst_example_simple_lap_d_facto0_sched1_not_rqrcpbegin ...............***Timeout 505.08 sec ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.467612e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.125328e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.427327e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time 
to factorize 5.477282e-04 s Time for mapping/scheduling 1.871530e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.819060e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 5.858756e-01 s Time to initialize coeftab 5.659883e-01 s Time to factorize 4.724697e+00 s ( 1.07 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.4 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 7.749074e+00 s - iteration 1 : total iteration time 7.08 s error 3.197e-14 Time for refinement 4.847075e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.196316e-14 max(|| b_i - A x_i ||_1) 6.527299e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.202113e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.196316e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.196316e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.196316e-14 max(|| b_i - A x_i ||_1) 6.527299e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.202113e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 6.527299e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.202113e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 6.527299e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.202113e-02 (SUCCESS) Start 2637: mpi_dst_example_simple_lap_d_facto0_sched1_not_rqrcpbegin 2391/3626 Test 
#2638: mpi_dst_example_simple_lap_d_facto0_sched1_not_rqrcpend .................***Timeout 505.08 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.826366e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.515501e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.018847e+00 s +-------------------------------------------------+ 
Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 8.964505e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.035558e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.412181e-01 s Time to initialize coeftab 1.433508e-01 s Time to factorize 2.925893e+00 s ( 1.73 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.859001e+00 s - iteration 1 : total iteration time 40.5 s error 6.3259e-15 Time for refinement 5.345654e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.321787e-15 max(|| b_i - A x_i ||_1) 3.439199e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.321649e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.321787e-15 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.321787e-15 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.321787e-15 max(|| b_i - A x_i ||_1) 3.439199e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.321649e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 3.439199e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.321649e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 
3.439199e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.321649e-03 (SUCCESS) Start 2638: mpi_dst_example_simple_lap_d_facto0_sched1_not_rqrcpend 2391/3626 Test #2641: mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_rqrcpbegin ...***Timeout 505.12 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.568132e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.413898e-02 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.069965e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.799037e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.815546e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.093034e-01 s Time to initialize coeftab 2.299388e-01 s Time to factorize 5.282112e+00 s (981.39 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88 Ko / 88.6 Ko ------------------------------------------------ Total 136 Ko / 137 Ko Time to solve 1.056107e+00 s - iteration 1 : total iteration time 1.55 s error 3.8891e-14 Time for refinement 2.767952e+00 s || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.888476e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.888476e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.888476e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.888476e-14 max(|| b_i - A x_i ||_1) 7.732715e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.716821e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 7.732715e-15 max(|| b_i - A x_i ||_1 / (||A||_1 
* ||x_i||_oo * eps)) 9.716821e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 7.732715e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.716821e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 7.732715e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.716821e-02 (SUCCESS) Start 2641: mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_rqrcpbegin 2391/3626 Test #2643: mpi_dst_example_simple_lap_d_facto0_sched1_not_tqrcpbegin ...............***Timeout 505.14 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.465171e+01 s +-------------------------------------------------+ Symbolic 
factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.024595e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.687051e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.841698e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.094326e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.899755e-01 s Time to initialize coeftab 6.397248e-01 s Time to factorize 6.139692e+00 s (844.31 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88 Ko / 88.6 Ko ------------------------------------------------ Total 136 Ko / 137 Ko Time to solve 3.176667e+01 s - iteration 1 : total iteration time 8.13 s error 6.5949e-14 Time for refinement 2.518496e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.594954e-14 max(|| b_i - A x_i ||_1) 1.128318e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.417828e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.594954e-14 
max(|| b_i - A x_i ||_1) 1.128318e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.417828e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.594954e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.594954e-14 max(|| b_i - A x_i ||_1) 1.128318e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.417828e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 1.128318e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.417828e-01 (SUCCESS) Start 2643: mpi_dst_example_simple_lap_d_facto0_sched1_not_tqrcpbegin 2391/3626 Test #2644: mpi_dst_example_simple_lap_d_facto0_sched1_not_tqrcpend .................***Timeout 505.14 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 
+-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.620623e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.689798e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.797087e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 4.478882e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.014773e+02 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.541550e-01 s Time to initialize coeftab 9.282189e-02 s Time to factorize 3.631904e+00 s ( 1.39 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.945720e+00 s - iteration 1 : total iteration time 8.78 s error 2.868e-15 Time for refinement 5.621978e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / 
|| b_i ||_2) 2.874794e-15 max(|| b_i - A x_i ||_1) 3.563077e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.477313e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.874794e-15 max(|| b_i - A x_i ||_1) 3.563077e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.477313e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.874794e-15 max(|| b_i - A x_i ||_1) 3.563077e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.477313e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.874794e-15 max(|| b_i - A x_i ||_1) 3.563077e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.477313e-03 (SUCCESS) Start 2644: mpi_dst_example_simple_lap_d_facto0_sched1_not_tqrcpend 2391/3626 Test #2645: mpi_dst_example_simple_lap_d_facto0_sched1_kway_tqrcpbegin ..............***Timeout 505.14 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 
3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.174552e+02 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.312396e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.249219e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.959403e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.233308e+02 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.985624e-01 s Time to initialize coeftab 2.486054e+00 s Time to factorize 6.575667e+00 s (788.33 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88 Ko / 88.6 Ko ------------------------------------------------ Total 136 Ko / 137 Ko Time to solve 2.989613e+00 s - iteration 1 : total iteration time 8.15 s error 3.7795e-14 Time for refinement 4.201651e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 
4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.780194e-14 max(|| b_i - A x_i ||_1) 6.745401e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.476177e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.780194e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.780194e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.780194e-14 max(|| b_i - A x_i ||_1) 6.745401e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.476177e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 6.745401e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.476177e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 6.745401e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.476177e-02 (SUCCESS) Start 2645: mpi_dst_example_simple_lap_d_facto0_sched1_kway_tqrcpbegin 2391/3626 Test #2646: mpi_dst_example_simple_lap_d_facto0_sched1_kway_tqrcpend ................***Timeout 505.14 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting 
Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.257477e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.009759e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.015946e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.228122e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.800054e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.789674e-01 s Time to initialize coeftab 1.821293e-01 s Time to factorize 2.838984e+00 s ( 1.78 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 9.541182e-01 s - iteration 1 : total iteration time 7.37 s error 6.7295e-15 Time for refinement 7.083249e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 
2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.738836e-15 max(|| b_i - A x_i ||_1) 5.597333e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.033529e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.738836e-15 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.738836e-15 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.738836e-15 max(|| b_i - A x_i ||_1) 5.597333e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.033529e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 5.597333e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.033529e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 5.597333e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.033529e-03 (SUCCESS) Start 2646: mpi_dst_example_simple_lap_d_facto0_sched1_kway_tqrcpend 2391/3626 Test #2647: mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_tqrcpbegin ...***Timeout 505.15 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute 
Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.017847e+02 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.157731e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.350252e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 8.172953e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.120808e+02 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.073382e-01 s Time to initialize coeftab 6.738403e-01 s Time to factorize 6.675986e+00 s (776.48 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.4 Ko / 88.6 Ko ------------------------------------------------ Total 
137 Ko / 137 Ko Time to solve 3.434287e+01 s - iteration 1 : total iteration time 7.23 s error 3.7991e-14 Time for refinement 2.357078e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.798446e-14 max(|| b_i - A x_i ||_1) 7.515394e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.443738e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.798446e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.798446e-14 max(|| b_i - A x_i ||_1) 7.515394e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.443738e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.798446e-14 max(|| b_i - A x_i ||_1) 7.515394e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.443738e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 7.515394e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.443738e-02 (SUCCESS) Start 2647: mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_tqrcpbegin 2391/3626 Test #2649: mpi_dst_example_simple_lap_d_facto0_sched1_not_rqrrtbegin ...............***Timeout 505.16 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 
- Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.716562e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.019328e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.660112e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.041087e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.912286e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.115192e-01 s Time to initialize coeftab 4.168536e-01 s Time to factorize 1.001815e+00 s ( 5.05 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o 
Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 8.353604e-01 s - iteration 1 : total iteration time 1.91 s error 1.569e-12 - iteration 2 : total iteration time 1.51 s error 4.5907e-18 Time for refinement 5.924092e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.229446e-16 max(|| b_i - A x_i ||_1) 6.095873e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.659989e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.229446e-16 max(|| b_i - A x_i ||_1) 6.095873e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.659989e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.229446e-16 max(|| b_i - A x_i ||_1) 6.095873e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.659989e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.229446e-16 max(|| b_i - A x_i ||_1) 6.095873e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.659989e-04 (SUCCESS) Start 2649: mpi_dst_example_simple_lap_d_facto0_sched1_not_rqrrtbegin 2391/3626 Test #2650: mpi_dst_example_simple_lap_d_facto0_sched1_not_rqrrtend .................***Timeout 505.16 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of 
MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.864852e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.433714e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.767221e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.835048e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.521304e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.549786e-01 s Time to initialize coeftab 
3.013379e-01 s Time to factorize 1.738456e+00 s ( 2.91 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 9.745882e-01 s - iteration 1 : total iteration time 1.09 s error 6.7846e-13 Time for refinement 2.379783e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.784535e-13 max(|| b_i - A x_i ||_1) 4.192599e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.268360e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.784535e-13 max(|| b_i - A x_i ||_1) 4.192599e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.268360e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.784535e-13 max(|| b_i - A x_i ||_1) 4.192599e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.268360e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.784535e-13 max(|| b_i - A x_i ||_1) 4.192599e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.268360e-01 (SUCCESS) Start 2650: mpi_dst_example_simple_lap_d_facto0_sched1_not_rqrrtend 2391/3626 Test #2654: mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_rqrrtend .....***Timeout 505.25 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.393244e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.283348e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.026129e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 7.218907e-01 s +-------------------------------------------------+ Analyze task: Total time for 
analyze 7.557493e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.461963e-01 s Time to initialize coeftab 2.249861e-01 s Time to factorize 1.316178e+00 s ( 3.85 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 7.022077e-01 s - iteration 1 : total iteration time 1.71 s error 5.6057e-15 Time for refinement 3.066570e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.612335e-15 max(|| b_i - A x_i ||_1) 6.383292e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.021155e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.612335e-15 max(|| b_i - A x_i ||_1) 6.383292e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.021155e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.612335e-15 max(|| b_i - A x_i ||_1) 6.383292e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.021155e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.612335e-15 max(|| b_i - A x_i ||_1) 6.383292e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.021155e-03 (SUCCESS) Start 2654: mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_rqrrtend 2391/3626 Test #2655: mpi_dst_example_simple_lap_d_facto0_sched1_kway_pqrcpilu0 ...............***Timeout 505.25 sec ischedInit: The thread number has been 
automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.079815e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.306852e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.142394e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: 
Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 6.481784e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.795663e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.400489e-01 s Time to initialize coeftab 3.514410e-01 s Time to factorize 2.749121e+00 s ( 1.84 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 7.079539e-01 s - iteration 1 : total iteration time 1.05 s error 1.083e-14 Time for refinement 2.340353e+00 s || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.083123e-14 max(|| b_i - A x_i ||_1) 1.691224e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.125169e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.083123e-14 max(|| b_i - A x_i ||_1) 1.691224e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.125169e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.083123e-14 max(|| b_i - A x_i ||_1) 1.691224e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.125169e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.083123e-14 max(|| b_i - A x_i ||_1) 1.691224e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.125169e-02 (SUCCESS) Start 2655: 
mpi_dst_example_simple_lap_d_facto0_sched1_kway_pqrcpilu0 2391/3626 Test #2658: mpi_dst_example_simple_lap_d_facto1_sched1_not_svdend ...................***Timeout 505.27 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.398651e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.488370e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 
1.444165e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 8.192819e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.553842e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.064996e-01 s Time to initialize coeftab 2.030247e-01 s Time to factorize 1.668580e+00 s ( 3.14 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.144955e+00 s - iteration 1 : total iteration time 0.665 s error 2.5046e-15 Time for refinement 3.445808e+00 s || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.508892e-15 max(|| b_i - A x_i ||_1) 3.361636e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.224184e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.508892e-15 max(|| b_i - A x_i ||_1) 3.361636e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.224184e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.508892e-15 max(|| b_i - A x_i ||_1) 3.361636e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.224184e-03 (SUCCESS) max(|| b_i - 
A x_i ||_2 / || b_i ||_2) 2.508892e-15 max(|| b_i - A x_i ||_1) 3.361636e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.224184e-03 (SUCCESS) Start 2658: mpi_dst_example_simple_lap_d_facto1_sched1_not_svdend 2391/3626 Test #2659: mpi_dst_example_simple_lap_d_facto1_sched1_kway_svdbegin ................***Timeout 505.28 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.383936e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute 
symbol matrix 9.085499e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.407567e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.494799e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.930711e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 6.382682e-01 s Time to initialize coeftab 5.075235e-01 s Time to factorize 8.439672e+00 s (634.99 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 2.896443e+01 s - iteration 1 : total iteration time 8.09 s error 1.9647e-14 Time for refinement 2.609479e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.965460e-14 max(|| b_i - A x_i ||_1) 3.871360e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.864697e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.965460e-14 max(|| b_i - A x_i ||_1) 3.871360e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.864697e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / 
|| b_i ||_2) 1.965460e-14 max(|| b_i - A x_i ||_1) 3.871360e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.864697e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.965460e-14 max(|| b_i - A x_i ||_1) 3.871360e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.864697e-02 (SUCCESS) Start 2659: mpi_dst_example_simple_lap_d_facto1_sched1_kway_svdbegin 2391/3626 Test #2660: mpi_dst_example_simple_lap_d_facto1_sched1_kway_svdend ..................***Timeout 505.28 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.887420e+01 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.132051e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.552061e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.043625e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.090996e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.880337e-01 s Time to initialize coeftab 8.894374e-02 s Time to factorize 7.831011e+00 s (684.34 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.958437e+01 s - iteration 1 : total iteration time 10.8 s error 1.8531e-15 Time for refinement 4.043542e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.860222e-15 max(|| b_i - A x_i ||_1) 2.266696e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.848299e-03 
(SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.860222e-15 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.860222e-15 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.860222e-15 max(|| b_i - A x_i ||_1) 2.266696e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.848299e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 2.266696e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.848299e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 2.266696e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.848299e-03 (SUCCESS) Start 2660: mpi_dst_example_simple_lap_d_facto1_sched1_kway_svdend 2391/3626 Test #2662: mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_svdend .......***Timeout 505.29 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 
1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.091969e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.882511e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.368356e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.279193e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.309794e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.267235e-01 s Time to initialize coeftab 4.632064e-01 s Time to factorize 1.380136e+00 s ( 3.79 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 8.664446e-01 s - iteration 1 : total iteration time 1.99 s error 1.2807e-15 Time for refinement 3.296563e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 
4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.283090e-15 max(|| b_i - A x_i ||_1) 1.652202e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.076134e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.283090e-15 max(|| b_i - A x_i ||_1) 1.652202e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.076134e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.283090e-15 max(|| b_i - A x_i ||_1) 1.652202e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.076134e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.283090e-15 max(|| b_i - A x_i ||_1) 1.652202e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.076134e-03 (SUCCESS) Start 2662: mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_svdend 2391/3626 Test #2663: mpi_dst_example_simple_lap_d_facto1_sched1_not_pqrcpbegin ...............***Timeout 505.29 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically 
set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.193661e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.781557e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.026360e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.263840e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.474494e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 8.422449e-01 s Time to initialize coeftab 1.308040e+00 s Time to factorize 3.640987e+00 s ( 1.44 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 3.512911e+01 s - iteration 1 : total iteration time 8.87 s error 2.3356e-14 Time for refinement 1.882435e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 
max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.335105e-14 max(|| b_i - A x_i ||_1) 3.785921e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.757335e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.335105e-14 max(|| b_i - A x_i ||_1) 3.785921e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.757335e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.335105e-14 max(|| b_i - A x_i ||_1) 3.785921e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.757335e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.335105e-14 max(|| b_i - A x_i ||_1) 3.785921e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.757335e-02 (SUCCESS) Start 2663: mpi_dst_example_simple_lap_d_facto1_sched1_not_pqrcpbegin 2391/3626 Test #2667: mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_pqrcpbegin ...***Timeout 505.31 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 
16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.979904e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.267263e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.579407e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.611520e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.077618e+02 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 5.442849e-01 s Time to initialize coeftab 5.761983e-01 s Time to factorize 5.303435e+00 s (1010.49 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 3.066675e+01 s - iteration 1 : total iteration time 9.16 s error 1.8553e-14 Time for 
refinement 2.502168e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.854994e-14 max(|| b_i - A x_i ||_1) 3.471449e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.362173e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.854994e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.854994e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.854994e-14 max(|| b_i - A x_i ||_1) 3.471449e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.362173e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 3.471449e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.362173e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 3.471449e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.362173e-02 (SUCCESS) Start 2667: mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_pqrcpbegin 2391/3626 Test #2670: mpi_dst_example_simple_lap_d_facto1_sched1_not_rqrcpend .................***Timeout 505.32 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress 
method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.568865e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.878647e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.066832e+01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.599842e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.080822e+02 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.118928e-01 s Time to initialize coeftab 1.337333e-01 s Time to factorize 2.029938e+00 s ( 2.58 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 
Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 2.735829e+00 s - iteration 1 : total iteration time 40.2 s error 4.285e-16 Time for refinement 5.475959e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.454138e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.454138e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.454138e-16 max(|| b_i - A x_i ||_1) 9.457188e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.188377e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 9.457188e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.188377e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.454138e-16 max(|| b_i - A x_i ||_1) 9.457188e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.188377e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 9.457188e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.188377e-03 (SUCCESS) Start 2670: mpi_dst_example_simple_lap_d_facto1_sched1_not_rqrcpend 2391/3626 Test #2671: mpi_dst_example_simple_lap_d_facto1_sched1_kway_rqrcpbegin ..............***Timeout 505.32 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple 
Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.136368e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.850650e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.528822e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.055204e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.310940e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.132267e-02 s Time to initialize coeftab 6.192746e-01 s Time to factorize 2.150348e+00 s ( 2.43 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ 
Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.4 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 8.481049e-01 s - iteration 1 : total iteration time 1.83 s error 4.4411e-14 Time for refinement 3.299927e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.440844e-14 max(|| b_i - A x_i ||_1) 8.144120e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.023379e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.440844e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.440844e-14 max(|| b_i - A x_i ||_1) 8.144120e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.023379e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 8.144120e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.023379e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.440844e-14 max(|| b_i - A x_i ||_1) 8.144120e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.023379e-01 (SUCCESS) Start 2671: mpi_dst_example_simple_lap_d_facto1_sched1_kway_rqrcpbegin 2391/3626 Test #2674: mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_rqrcpend .....***Timeout 505.33 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI 
communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.317829e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.237771e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.971392e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.358012e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.113697e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.274424e-01 s Time to initialize coeftab 
6.333930e-02 s Time to factorize 2.661788e+00 s ( 1.97 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.773479e+01 s - iteration 1 : total iteration time 7.59 s error 7.5382e-15 Time for refinement 4.422555e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.540796e-15 max(|| b_i - A x_i ||_1) 7.599392e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.549289e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.540796e-15 max(|| b_i - A x_i ||_1) 7.599392e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.549289e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.540796e-15 max(|| b_i - A x_i ||_1) 7.599392e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.549289e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.540796e-15 max(|| b_i - A x_i ||_1) 7.599392e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.549289e-03 (SUCCESS) Start 2674: mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_rqrcpend 2391/3626 Test #2678: mpi_dst_example_simple_lap_d_facto1_sched1_kway_tqrcpend ................***Timeout 505.39 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.505263e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.360994e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.900402e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.064667e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 
9.900313e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 6.510454e-01 s Time to initialize coeftab 8.810880e-02 s Time to factorize 2.597134e+00 s ( 2.02 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.144226e+00 s - iteration 1 : total iteration time 8.38 s error 2.9039e-15 Time for refinement 5.668038e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.904366e-15 max(|| b_i - A x_i ||_1) 2.672826e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.358636e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.904366e-15 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.904366e-15 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.904366e-15 max(|| b_i - A x_i ||_1) 2.672826e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.358636e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 2.672826e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.358636e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 2.672826e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.358636e-03 (SUCCESS) Start 2678: mpi_dst_example_simple_lap_d_facto1_sched1_kway_tqrcpend 2391/3626 Test #2684: mpi_dst_example_simple_lap_d_facto1_sched1_kway_rqrrtend ................***Timeout 505.41 sec ischedInit: The thread number has been automatically set to 
256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.909919e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.947804e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.223196e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to 
factorize 6.932254e-04 s Time for mapping/scheduling 8.985420e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.087725e+02 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 6.062307e-01 s Time to initialize coeftab 2.355190e+01 s Time to factorize 4.908490e+00 s ( 1.07 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 8.004373e+00 s - iteration 1 : total iteration time 9.49 s error 3.9783e-14 Time for refinement 1.809937e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.978200e-14 max(|| b_i - A x_i ||_1) 3.061364e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.846868e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.978200e-14 max(|| b_i - A x_i ||_1) 3.061364e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.846868e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.978200e-14 max(|| b_i - A x_i ||_1) 3.061364e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.846868e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.978200e-14 max(|| b_i - A x_i ||_1) 3.061364e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.846868e-02 (SUCCESS) Start 2684: mpi_dst_example_simple_lap_d_facto1_sched1_kway_rqrrtend 2391/3626 Test #2685: 
mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_rqrrtbegin ...***Timeout 505.41 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.085589e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.733167e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.003947e-01 s +-------------------------------------------------+ 
Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.819829e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.834759e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.883184e-01 s Time to initialize coeftab 3.814928e-01 s Time to factorize 3.661139e+00 s ( 1.43 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 2.390606e+00 s - iteration 1 : total iteration time 30.8 s error 6.1796e-13 Time for refinement 5.887918e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.179563e-13 max(|| b_i - A x_i ||_1) 1.214389e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.525984e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.179563e-13 max(|| b_i - A x_i ||_1) 1.214389e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.525984e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.179563e-13 max(|| b_i - A x_i ||_1) 1.214389e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.525984e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.179563e-13 max(|| b_i - A x_i ||_1) 
1.214389e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.525984e+00 (SUCCESS) Start 2685: mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_rqrrtbegin 2391/3626 Test #2686: mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_rqrrtend .....***Timeout 505.41 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.050828e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.412206e-02 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.677393e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.356464e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.233265e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.157132e-01 s Time to initialize coeftab 5.829952e-01 s Time to factorize 5.609942e-01 s ( 9.33 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.299721e+00 s - iteration 1 : total iteration time 1.84 s error 2.3503e-15 Time for refinement 3.629294e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.353294e-15 max(|| b_i - A x_i ||_1) 2.154725e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.707597e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.353294e-15 max(|| b_i - A x_i ||_1) 2.154725e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.707597e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.353294e-15 
max(|| b_i - A x_i ||_1) 2.154725e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.707597e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.353294e-15 max(|| b_i - A x_i ||_1) 2.154725e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.707597e-03 (SUCCESS) Start 2686: mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_rqrrtend 2391/3626 Test #2687: mpi_dst_example_simple_lap_d_facto1_sched1_kway_pqrcpilu0 ...............***Timeout 505.41 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.665160e+01 s +-------------------------------------------------+ 
Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.875170e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.802680e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.422241e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.898665e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.036169e-01 s Time to initialize coeftab 2.908026e-01 s Time to factorize 7.256484e-01 s ( 7.21 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.430209e+00 s - iteration 1 : total iteration time 1.29 s error 9.6169e-15 Time for refinement 3.369212e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.623199e-15 max(|| b_i - A x_i ||_1) 1.651885e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.075735e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 
9.623199e-15 max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.623199e-15 max(|| b_i - A x_i ||_1) 1.651885e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.075735e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 1.651885e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.075735e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.623199e-15 max(|| b_i - A x_i ||_1) 1.651885e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.075735e-02 (SUCCESS) Start 2687: mpi_dst_example_simple_lap_d_facto1_sched1_kway_pqrcpilu0 2391/3626 Test #2688: mpi_dst_example_simple_lap_d_facto1_sched1_kway_pqrcpilu1 ...............***Timeout 505.41 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 
+-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.166926e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.866142e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.452162e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.036313e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.445156e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.342531e-01 s Time to initialize coeftab 1.703760e-01 s Time to factorize 6.489277e+00 s (825.84 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 5.727915e-01 s - iteration 1 : total iteration time 1.95 s error 8.2793e-15 Time for refinement 1.268852e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 
/ || b_i ||_2) 8.281800e-15 max(|| b_i - A x_i ||_1) 1.306485e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.641710e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.281800e-15 max(|| b_i - A x_i ||_1) 1.306485e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.641710e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.281800e-15 max(|| b_i - A x_i ||_1) 1.306485e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.641710e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.281800e-15 max(|| b_i - A x_i ||_1) 1.306485e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.641710e-02 (SUCCESS) Start 2688: mpi_dst_example_simple_lap_d_facto1_sched1_kway_pqrcpilu1 2391/3626 Test #2689: mpi_dst_example_simple_lap_d_facto2_sched1_not_svdbegin .................***Timeout 505.41 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has 
been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.214461e+02 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.829092e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.219605e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.636611e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.247762e+02 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.975314e-01 s Time to initialize coeftab 2.441050e+00 s Time to factorize 5.698921e+00 s ( 1.75 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 8.777231e+00 s - iteration 1 : total iteration time 7.79 s error 1.9907e-14 Time for refinement 4.452460e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A 
x_i ||_2 / || b_i ||_2) 1.990010e-14 max(|| b_i - A x_i ||_1) 4.010447e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.039472e-02 (SUCCESS) || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.990010e-14 max(|| b_i - A x_i ||_1) 4.010447e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.039472e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.990010e-14 max(|| b_i - A x_i ||_1) 4.010447e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.039472e-02 (SUCCESS) max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.990010e-14 max(|| b_i - A x_i ||_1) 4.010447e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.039472e-02 (SUCCESS) Start 2689: mpi_dst_example_simple_lap_d_facto2_sched1_not_svdbegin 2391/3626 Test #2691: mpi_dst_example_simple_lap_d_facto2_sched1_kway_svdbegin ................***Timeout 505.41 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: 
The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.310248e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.032225e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.240797e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.585845e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.887529e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 8.656528e-01 s Time to initialize coeftab 2.071016e+00 s Time to factorize 7.449094e+00 s ( 1.34 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 3.445521e+01 s - iteration 1 : total iteration time 7.08 s error 1.8627e-14 Time for refinement 1.700394e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 
max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.863063e-14 max(|| b_i - A x_i ||_1) 3.889375e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.887334e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.863063e-14 max(|| b_i - A x_i ||_1) 3.889375e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.887334e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.863063e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.863063e-14 max(|| b_i - A x_i ||_1) 3.889375e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.887334e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 3.889375e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.887334e-02 (SUCCESS) Start 2691: mpi_dst_example_simple_lap_d_facto2_sched1_kway_svdbegin 2391/3626 Test #2695: mpi_dst_example_simple_lap_d_facto2_sched1_not_pqrcpbegin ...............***Timeout 505.42 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 
1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.157914e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.441825e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.100714e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 9.487912e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.386716e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.076629e-01 s Time to initialize coeftab 2.641837e-01 s Time to factorize 6.306228e+00 s ( 1.58 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 
3.285792e+01 s - iteration 1 : total iteration time 6.69 s error 1.8178e-14 Time for refinement 2.431131e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.817935e-14 max(|| b_i - A x_i ||_1) 3.484283e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.378301e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.817935e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.817935e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.817935e-14 max(|| b_i - A x_i ||_1) 3.484283e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.378301e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 3.484283e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.378301e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 3.484283e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.378301e-02 (SUCCESS) Start 2695: mpi_dst_example_simple_lap_d_facto2_sched1_not_pqrcpbegin 2391/3626 Test #2696: mpi_dst_example_simple_lap_d_facto2_sched1_not_pqrcpend .................***Timeout 505.42 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 
2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.225475e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.265389e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.591750e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.211633e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.625762e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.805288e-01 s Time to initialize coeftab 8.548237e-02 s Time to factorize 1.926848e+00 s ( 5.18 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not 
selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 9.740150e-01 s - iteration 1 : total iteration time 8.44 s error 4.1837e-15 Time for refinement 5.480604e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.184441e-15 max(|| b_i - A x_i ||_1) 5.274384e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.627717e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.184441e-15 max(|| b_i - A x_i ||_1) 5.274384e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.627717e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.184441e-15 max(|| b_i - A x_i ||_1) 5.274384e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.627717e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.184441e-15 max(|| b_i - A x_i ||_1) 5.274384e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.627717e-03 (SUCCESS) Start 2696: mpi_dst_example_simple_lap_d_facto2_sched1_not_pqrcpend 2391/3626 Test #2697: mpi_dst_example_simple_lap_d_facto2_sched1_kway_pqrcpbegin ..............***Timeout 505.42 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled 
Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.974910e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.451975e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.910401e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 3.324788e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.388992e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.699404e-02 s Time to initialize coeftab 3.326493e+00 s Time to factorize 8.160691e+00 s ( 1.22 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 
Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 6.503235e+00 s - iteration 1 : total iteration time 15.8 s error 1.8283e-14 Time for refinement 1.907192e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.828665e-14 max(|| b_i - A x_i ||_1) 3.237959e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.068773e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.828665e-14 max(|| b_i - A x_i ||_1) 3.237959e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.068773e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.828665e-14 max(|| b_i - A x_i ||_1) 3.237959e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.068773e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.828665e-14 max(|| b_i - A x_i ||_1) 3.237959e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.068773e-02 (SUCCESS) Start 2697: mpi_dst_example_simple_lap_d_facto2_sched1_kway_pqrcpbegin 2391/3626 Test #2700: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_pqrcpend .....***Timeout 505.40 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.298344e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.344487e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.113268e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.606004e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.532493e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to 
initialize internal csc 5.684595e-01 s Time to initialize coeftab 8.301759e-02 s Time to factorize 1.912269e+00 s ( 5.22 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 9.312629e-01 s - iteration 1 : total iteration time 1.19 s error 4.1804e-15 Time for refinement 2.788875e+00 s || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.173394e-15 max(|| b_i - A x_i ||_1) 4.200497e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.278286e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.173394e-15 max(|| b_i - A x_i ||_1) 4.200497e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.278286e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.173394e-15 max(|| b_i - A x_i ||_1) 4.200497e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.278286e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.173394e-15 max(|| b_i - A x_i ||_1) 4.200497e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.278286e-03 (SUCCESS) Start 2700: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_pqrcpend 2391/3626 Test #2703: mpi_dst_example_simple_lap_d_facto2_sched1_kway_rqrcpbegin ..............***Timeout 505.39 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.910316e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.135919e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.064523e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.724566e+00 s 
+-------------------------------------------------+ Analyze task: Total time for analyze 1.031725e+02 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 5.934771e-01 s Time to initialize coeftab 7.311612e-01 s Time to factorize 1.022897e+01 s (999.53 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 225 Ko / 226 Ko Time to solve 2.782428e+01 s - iteration 1 : total iteration time 10.4 s error 1.1642e-13 Time for refinement 2.703887e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.164207e-13 max(|| b_i - A x_i ||_1) 1.410538e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.772462e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.164207e-13 max(|| b_i - A x_i ||_1) 1.410538e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.772462e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.164207e-13 max(|| b_i - A x_i ||_1) 1.410538e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.772462e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.164207e-13 max(|| b_i - A x_i ||_1) 1.410538e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.772462e-01 (SUCCESS) Start 2703: mpi_dst_example_simple_lap_d_facto2_sched1_kway_rqrcpbegin Test #2499: mpi_dst_example_simple_lap_z_facto4_sched0_not_svdend 
...................***Timeout 488.44 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.441335e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.416621e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.289955e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 
Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.134901e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.507199e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 2.087448e-01 s Time to initialize coeftab 1.144488e-01 s Time to factorize 2.726583e+00 s ( 7.81 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 1.013934e-01 s - iteration 1 : total iteration time 0.046 s error 3.1668e-16 Time for refinement 1.496156e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.429057e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.429057e-16 max(|| b_i - A x_i ||_1) 8.353941e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.107983e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 8.353941e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.107983e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.429057e-16 max(|| b_i - A x_i ||_1) 8.353941e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.107983e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.429057e-16 max(|| b_i - A x_i ||_1) 8.353941e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 
2.107983e-03 (SUCCESS) Test #2579: mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_rqrcpend .....***Timeout 488.44 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.993809e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.730400e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.137175e-01 s 
+-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 9.212600e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.141672e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.504411e-01 s Time to initialize coeftab 8.857985e-02 s Time to factorize 4.880176e+00 s ( 1.07 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.013473e+00 s Time for refinement 1.132218e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.540347e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.540347e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.540347e-07 max(|| b_i - A x_i ||_1) 1.328392e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.669245e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.540347e-07 max(|| b_i - A x_i ||_1) 1.328392e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.669245e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.328392e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.669245e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 
1.328392e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.669245e+00 (SUCCESS) Start 2579: mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_rqrcpend Test #2580: mpi_dst_example_simple_lap_s_facto1_sched1_not_tqrcpbegin ...............***Timeout 488.44 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.625767e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.778285e-01 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.354462e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.304851e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.912193e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.389823e-01 s Time to initialize coeftab 2.942254e-01 s Time to factorize 9.995999e+00 s (536.12 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.1 Ko / 44.3 Ko ------------------------------------------------ Total 68.3 Ko / 68.5 Ko Time to solve 1.103342e+01 s - iteration 1 : total iteration time 8.78 s error 5.527e-11 Time for refinement 4.389013e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.776284e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.776284e-08 max(|| b_i - A x_i ||_1) 2.860219e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.594123e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.776284e-08 max(|| b_i - A x_i ||_1) 2.860219e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.594123e-01 (SUCCESS) 
max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.776284e-08 max(|| b_i - A x_i ||_1) 2.860219e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.594123e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.860219e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.594123e-01 (SUCCESS) Start 2580: mpi_dst_example_simple_lap_s_facto1_sched1_not_tqrcpbegin Test #2581: mpi_dst_example_simple_lap_s_facto1_sched1_not_tqrcpend .................***Timeout 488.45 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.131125e+02 s +-------------------------------------------------+ Symbolic factorization 
subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.191370e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.442066e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 4.175404e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.197775e+02 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.178373e-01 s Time to initialize coeftab 1.406852e+00 s Time to factorize 1.076331e+00 s ( 4.86 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 3.787567e+00 s Time for refinement 1.635425e+00 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.226912e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.226912e-07 max(|| b_i - A x_i ||_1) 1.201589e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.509905e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.226912e-07 max(|| b_i - A x_i ||_2 
/ || b_i ||_2) 8.226912e-07 max(|| b_i - A x_i ||_1) 1.201589e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.509905e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.201589e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.509905e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.201589e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.509905e+00 (SUCCESS) Start 2581: mpi_dst_example_simple_lap_s_facto1_sched1_not_tqrcpend Test #2583: mpi_dst_example_simple_lap_s_facto1_sched1_kway_tqrcpend ................***Timeout 488.45 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 
9.653216e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.418035e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.273696e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.263907e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.005962e+02 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.294277e-01 s Time to initialize coeftab 1.619470e-01 s Time to factorize 3.837573e+00 s ( 1.36 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 2.403833e+00 s Time for refinement 4.147432e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.003472e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.003472e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.003472e-07 max(|| b_i - A x_i ||_1) 1.023526e-07 max(|| b_i - 
A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.286153e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.003472e-07 max(|| b_i - A x_i ||_1) 1.023526e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.286153e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.023526e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.286153e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.023526e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.286153e+00 (SUCCESS) Start 2583: mpi_dst_example_simple_lap_s_facto1_sched1_kway_tqrcpend Test #2584: mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_tqrcpbegin ...***Timeout 488.45 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 
+-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.580139e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.752077e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.501583e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.734011e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.944140e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 7.893491e-03 s Time to initialize coeftab 6.126799e-01 s Time to factorize 1.251114e+00 s ( 4.18 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.2 Ko / 44.3 Ko ------------------------------------------------ Total 68.4 Ko / 68.5 Ko Time to solve 2.958160e+00 s - iteration 1 : total iteration time 6.45 s error 2.5045e-11 Time for refinement 1.421656e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i 
||_2 / || b_i ||_2) 5.727330e-08 max(|| b_i - A x_i ||_1) 2.888561e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.629737e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.727330e-08 max(|| b_i - A x_i ||_1) 2.888561e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.629737e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.727330e-08 max(|| b_i - A x_i ||_1) 2.888561e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.629737e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.727330e-08 max(|| b_i - A x_i ||_1) 2.888561e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.629737e-01 (SUCCESS) Start 2584: mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_tqrcpbegin Test #2587: mpi_dst_example_simple_lap_s_facto1_sched1_not_rqrrtend .................***Timeout 488.46 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 
Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.268341e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.770367e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.422597e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.908419e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.419653e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.632848e-01 s Time to initialize coeftab 5.621355e-02 s Time to factorize 7.100157e-01 s ( 7.37 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.041497e+00 s Time for refinement 6.819024e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 
4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.733746e-07 max(|| b_i - A x_i ||_1) 1.152814e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.448614e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.733746e-07 max(|| b_i - A x_i ||_1) 1.152814e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.448614e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.733746e-07 max(|| b_i - A x_i ||_1) 1.152814e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.448614e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.733746e-07 max(|| b_i - A x_i ||_1) 1.152814e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.448614e+00 (SUCCESS) Start 2587: mpi_dst_example_simple_lap_s_facto1_sched1_not_rqrrtend Test #2594: mpi_dst_example_simple_lap_s_facto2_sched1_not_svdend ...................***Timeout 488.47 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections 
width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.009518e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.709020e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.999258e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.112486e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.738573e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 7.042947e-01 s Time to initialize coeftab 2.599643e-01 s Time to factorize 6.201649e+00 s ( 1.61 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 3.016380e+01 s Time for refinement 1.271085e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| 
b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.796739e-07 max(|| b_i - A x_i ||_1) 7.515581e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.444007e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.796739e-07 max(|| b_i - A x_i ||_1) 7.515581e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.444007e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.796739e-07 max(|| b_i - A x_i ||_1) 7.515581e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.444007e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.796739e-07 max(|| b_i - A x_i ||_1) 7.515581e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.444007e-01 (SUCCESS) Start 2594: mpi_dst_example_simple_lap_s_facto2_sched1_not_svdend Test #2598: mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_svdend .......***Timeout 488.48 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of 
projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.401510e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.741642e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.474094e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.185036e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.553581e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.215096e-01 s Time to initialize coeftab 2.096925e-01 s Time to factorize 4.950663e+00 s ( 2.02 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 5.182322e-01 s Time for refinement 9.026232e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i 
||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.717137e-07 max(|| b_i - A x_i ||_1) 7.339782e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.223099e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.717137e-07 max(|| b_i - A x_i ||_1) 7.339782e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.223099e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.717137e-07 max(|| b_i - A x_i ||_1) 7.339782e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.223099e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.717137e-07 max(|| b_i - A x_i ||_1) 7.339782e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.223099e-01 (SUCCESS) Start 2598: mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_svdend Test #2599: mpi_dst_example_simple_lap_s_facto2_sched1_not_pqrcpbegin ...............***Timeout 488.48 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 
Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.231927e+02 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.693312e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.777408e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.965514e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.269777e+02 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.143672e-01 s Time to initialize coeftab 8.179238e-01 s Time to factorize 4.510281e+00 s ( 2.21 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 1.741486e+01 s - 
iteration 1 : total iteration time 8.03 s error 9.0133e-11 Time for refinement 3.783502e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.251457e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.251457e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.251457e-08 max(|| b_i - A x_i ||_1) 2.970582e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.732804e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.251457e-08 max(|| b_i - A x_i ||_1) 2.970582e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.732804e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.970582e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.732804e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.970582e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.732804e-01 (SUCCESS) Start 2599: mpi_dst_example_simple_lap_s_facto2_sched1_not_pqrcpbegin Test #2600: mpi_dst_example_simple_lap_s_facto2_sched1_not_pqrcpend .................***Timeout 488.47 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: 
Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.884337e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.471433e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.578349e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 3.417423e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.434622e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 4.955878e-01 s Time to initialize coeftab 4.070056e-01 s Time to factorize 3.618765e+00 s ( 2.76 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside 
selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 1.326249e+00 s - iteration 1 : total iteration time 7.73 s error 8.2398e-13 Time for refinement 5.292490e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.547292e-08 max(|| b_i - A x_i ||_1) 2.728840e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.429033e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.547292e-08 max(|| b_i - A x_i ||_1) 2.728840e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.429033e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.547292e-08 max(|| b_i - A x_i ||_1) 2.728840e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.429033e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.547292e-08 max(|| b_i - A x_i ||_1) 2.728840e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.429033e-01 (SUCCESS) Start 2600: mpi_dst_example_simple_lap_s_facto2_sched1_not_pqrcpend Test #2602: mpi_dst_example_simple_lap_s_facto2_sched1_kway_pqrcpend ................***Timeout 488.46 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) 
Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.420350e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.598935e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.795636e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.621003e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.857355e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 5.216115e-01 s Time to initialize coeftab 2.139260e-01 s Time to factorize 7.488248e+00 s ( 1.33 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: 
------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 2.430481e+01 s - iteration 1 : total iteration time 7.09 s error 4.5769e-12 Time for refinement 1.977721e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.760849e-08 max(|| b_i - A x_i ||_1) 2.794221e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.511191e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.760849e-08 max(|| b_i - A x_i ||_1) 2.794221e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.511191e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.760849e-08 max(|| b_i - A x_i ||_1) 2.794221e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.511191e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.760849e-08 max(|| b_i - A x_i ||_1) 2.794221e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.511191e-01 (SUCCESS) Start 2602: mpi_dst_example_simple_lap_s_facto2_sched1_kway_pqrcpend Test #2603: mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_pqrcpbegin ...***Timeout 488.46 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per 
process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.140746e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.233363e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.265182e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 5.709795e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.231978e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.649687e-01 s 
Time to initialize coeftab 1.523960e-01 s Time to factorize 4.352526e+00 s ( 2.29 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 5.523686e-01 s - iteration 1 : total iteration time 1.22 s error 3.7451e-11 Time for refinement 1.051159e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.113547e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.113547e-08 max(|| b_i - A x_i ||_1) 2.951055e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.708267e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.113547e-08 max(|| b_i - A x_i ||_1) 2.951055e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.708267e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.113547e-08 max(|| b_i - A x_i ||_1) 2.951055e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.708267e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.951055e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.708267e-01 (SUCCESS) Start 2603: mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_pqrcpbegin Test #2604: mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_pqrcpend .....***Timeout 488.45 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX 
package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.120465e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.331195e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.146263e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.470156e+00 s +-------------------------------------------------+ Analyze task: Total 
time for analyze 9.341097e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.551803e+00 s Time to initialize coeftab 8.614665e-01 s Time to factorize 1.863005e+00 s ( 5.36 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 1.856113e+00 s - iteration 1 : total iteration time 10.6 s error 8.5092e-13 Time for refinement 5.465629e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.303544e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.303544e-08 max(|| b_i - A x_i ||_1) 2.633866e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.309690e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.303544e-08 max(|| b_i - A x_i ||_1) 2.633866e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.309690e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.303544e-08 max(|| b_i - A x_i ||_1) 2.633866e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.309690e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.633866e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.309690e-01 (SUCCESS) Start 2604: mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_pqrcpend Test #2605: mpi_dst_example_simple_lap_s_facto2_sched1_not_rqrcpbegin ...............***Timeout 488.44 sec ischedInit: The thread number has been 
automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.013008e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.734538e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.587059e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: 
Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 6.812978e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.430393e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.422207e-01 s Time to initialize coeftab 4.469930e-01 s Time to factorize 7.938565e+00 s ( 1.26 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.4 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 3.234737e+01 s - iteration 1 : total iteration time 7.51 s error 4.3385e-11 Time for refinement 2.246196e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.967541e-08 max(|| b_i - A x_i ||_1) 2.925372e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.675994e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.967541e-08 max(|| b_i - A x_i ||_1) 2.925372e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.675994e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.967541e-08 max(|| b_i - A x_i ||_1) 2.925372e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.675994e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.967541e-08 max(|| b_i - A x_i ||_1) 2.925372e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.675994e-01 (SUCCESS) Start 2605: 
mpi_dst_example_simple_lap_s_facto2_sched1_not_rqrcpbegin 2392/3626 Test #2708: mpi_dst_example_simple_lap_d_facto2_sched1_not_tqrcpend .................***Timeout 441.04 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.577432e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.747631e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 
7.059434e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.599172e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.917274e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 4.183332e-01 s Time to initialize coeftab 1.362435e-01 s Time to factorize 4.138990e+00 s ( 2.41 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 4.438785e-01 s - iteration 1 : total iteration time 37.7 s error 1.8994e-15 Time for refinement 5.518175e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.902235e-15 max(|| b_i - A x_i ||_1) 2.376637e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.986448e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.902235e-15 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.902235e-15 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.902235e-15 max(|| b_i - A x_i ||_1) 2.376637e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.986448e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 2.376637e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * 
eps)) 2.986448e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 2.376637e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.986448e-03 (SUCCESS) Start 2708: mpi_dst_example_simple_lap_d_facto2_sched1_not_tqrcpend 2392/3626 Test #2709: mpi_dst_example_simple_lap_d_facto2_sched1_kway_tqrcpbegin ..............***Timeout 434.66 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.542088e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol 
matrix 1.501238e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.496632e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.182695e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.106131e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.594635e-01 s Time to initialize coeftab 5.639000e-01 s Time to factorize 7.919966e+00 s ( 1.26 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 225 Ko / 226 Ko Time to solve 3.123476e+01 s - iteration 1 : total iteration time 8 s error 3.1908e-14 Time for refinement 1.862886e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.190430e-14 max(|| b_i - A x_i ||_1) 6.457551e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.114468e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.190430e-14 max(|| b_i - A x_i ||_1) 6.457551e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.114468e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 
3.190430e-14 max(|| b_i - A x_i ||_1) 6.457551e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.114468e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.190430e-14 max(|| b_i - A x_i ||_1) 6.457551e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.114468e-02 (SUCCESS) Start 2709: mpi_dst_example_simple_lap_d_facto2_sched1_kway_tqrcpbegin 2392/3626 Test #2710: mpi_dst_example_simple_lap_d_facto2_sched1_kway_tqrcpend ................***Timeout 431.53 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.547113e+01 s +-------------------------------------------------+ 
Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.208066e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.061778e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 8.515458e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.781509e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.931864e-02 s Time to initialize coeftab 8.788362e-01 s Time to factorize 8.209228e-01 s (12.16 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 1.971945e+00 s - iteration 1 : total iteration time 2.13 s error 3.7474e-15 Time for refinement 6.569974e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.748253e-15 max(|| b_i - A x_i ||_1) 3.577359e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.495259e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.748253e-15 
max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.748253e-15 max(|| b_i - A x_i ||_1) 3.577359e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.495259e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 3.577359e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.495259e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.748253e-15 max(|| b_i - A x_i ||_1) 3.577359e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.495259e-03 (SUCCESS) Start 2710: mpi_dst_example_simple_lap_d_facto2_sched1_kway_tqrcpend 2392/3626 Test #2712: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_tqrcpend .....***Timeout 368.48 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 
+-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.867729e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.751341e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.322062e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.528527e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.064731e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.682852e-01 s Time to initialize coeftab 6.483379e-01 s Time to factorize 8.051740e-01 s (12.40 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 1.240008e+00 s - iteration 1 : total iteration time 1.4 s error 4.9987e-15 Time for refinement 4.320001e+00 s || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i 
||_2) 5.001179e-15 max(|| b_i - A x_i ||_1) 4.953178e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.224093e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.001179e-15 max(|| b_i - A x_i ||_1) 4.953178e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.224093e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.001179e-15 max(|| b_i - A x_i ||_1) 4.953178e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.224093e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.001179e-15 max(|| b_i - A x_i ||_1) 4.953178e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.224093e-03 (SUCCESS) Start 2712: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_tqrcpend Start 2714: mpi_dst_example_simple_lap_d_facto2_sched1_not_rqrrtend Start 2715: mpi_dst_example_simple_lap_d_facto2_sched1_kway_rqrrtbegin Start 2716: mpi_dst_example_simple_lap_d_facto2_sched1_kway_rqrrtend Start 2717: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_rqrrtbegin Start 2718: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_rqrrtend Start 2719: mpi_dst_example_simple_lap_d_facto2_sched1_kway_pqrcpilu0 Start 2720: mpi_dst_example_simple_lap_d_facto2_sched1_kway_pqrcpilu1 Start 2721: mpi_dst_example_simple_lap_c_facto0_sched1_not_svdbegin Start 2722: mpi_dst_example_simple_lap_c_facto0_sched1_not_svdend Start 2723: mpi_dst_example_simple_lap_c_facto0_sched1_kway_svdbegin Start 2724: mpi_dst_example_simple_lap_c_facto0_sched1_kway_svdend Start 2725: mpi_dst_example_simple_lap_c_facto0_sched1_kwayprojections_svdbegin Start 2726: mpi_dst_example_simple_lap_c_facto0_sched1_kwayprojections_svdend Start 2727: mpi_dst_example_simple_lap_c_facto0_sched1_not_pqrcpbegin Start 2728: mpi_dst_example_simple_lap_c_facto0_sched1_not_pqrcpend Start 2729: mpi_dst_example_simple_lap_c_facto0_sched1_kway_pqrcpbegin Start 2730: mpi_dst_example_simple_lap_c_facto0_sched1_kway_pqrcpend Start 2731: 
mpi_dst_example_simple_lap_c_facto0_sched1_kwayprojections_pqrcpbegin Start 2732: mpi_dst_example_simple_lap_c_facto0_sched1_kwayprojections_pqrcpend Start 2733: mpi_dst_example_simple_lap_c_facto0_sched1_not_rqrcpbegin Start 2734: mpi_dst_example_simple_lap_c_facto0_sched1_not_rqrcpend Start 2735: mpi_dst_example_simple_lap_c_facto0_sched1_kway_rqrcpbegin Start 2736: mpi_dst_example_simple_lap_c_facto0_sched1_kway_rqrcpend Start 2737: mpi_dst_example_simple_lap_c_facto0_sched1_kwayprojections_rqrcpbegin Start 2738: mpi_dst_example_simple_lap_c_facto0_sched1_kwayprojections_rqrcpend Start 2739: mpi_dst_example_simple_lap_c_facto0_sched1_not_tqrcpbegin Start 2740: mpi_dst_example_simple_lap_c_facto0_sched1_not_tqrcpend Start 2741: mpi_dst_example_simple_lap_c_facto0_sched1_kway_tqrcpbegin Start 2742: mpi_dst_example_simple_lap_c_facto0_sched1_kway_tqrcpend Start 2743: mpi_dst_example_simple_lap_c_facto0_sched1_kwayprojections_tqrcpbegin Start 2744: mpi_dst_example_simple_lap_c_facto0_sched1_kwayprojections_tqrcpend Start 2745: mpi_dst_example_simple_lap_c_facto0_sched1_not_rqrrtbegin Start 2746: mpi_dst_example_simple_lap_c_facto0_sched1_not_rqrrtend Start 2747: mpi_dst_example_simple_lap_c_facto0_sched1_kway_rqrrtbegin Start 2748: mpi_dst_example_simple_lap_c_facto0_sched1_kway_rqrrtend Start 2749: mpi_dst_example_simple_lap_c_facto0_sched1_kwayprojections_rqrrtbegin Start 2750: mpi_dst_example_simple_lap_c_facto0_sched1_kwayprojections_rqrrtend Start 2751: mpi_dst_example_simple_lap_c_facto0_sched1_kway_pqrcpilu0 Start 2752: mpi_dst_example_simple_lap_c_facto0_sched1_kway_pqrcpilu1 Start 2753: mpi_dst_example_simple_lap_c_facto1_sched1_not_svdbegin Start 2754: mpi_dst_example_simple_lap_c_facto1_sched1_not_svdend Start 2755: mpi_dst_example_simple_lap_c_facto1_sched1_kway_svdbegin Start 2756: mpi_dst_example_simple_lap_c_facto1_sched1_kway_svdend Start 2757: mpi_dst_example_simple_lap_c_facto1_sched1_kwayprojections_svdbegin Start 2758: 
mpi_dst_example_simple_lap_c_facto1_sched1_kwayprojections_svdend Start 2759: mpi_dst_example_simple_lap_c_facto1_sched1_not_pqrcpbegin Start 2760: mpi_dst_example_simple_lap_c_facto1_sched1_not_pqrcpend Start 2761: mpi_dst_example_simple_lap_c_facto1_sched1_kway_pqrcpbegin Start 2762: mpi_dst_example_simple_lap_c_facto1_sched1_kway_pqrcpend Start 2763: mpi_dst_example_simple_lap_c_facto1_sched1_kwayprojections_pqrcpbegin Start 2764: mpi_dst_example_simple_lap_c_facto1_sched1_kwayprojections_pqrcpend Start 2765: mpi_dst_example_simple_lap_c_facto1_sched1_not_rqrcpbegin Start 2766: mpi_dst_example_simple_lap_c_facto1_sched1_not_rqrcpend Start 2767: mpi_dst_example_simple_lap_c_facto1_sched1_kway_rqrcpbegin Start 2768: mpi_dst_example_simple_lap_c_facto1_sched1_kway_rqrcpend Start 2769: mpi_dst_example_simple_lap_c_facto1_sched1_kwayprojections_rqrcpbegin Start 2770: mpi_dst_example_simple_lap_c_facto1_sched1_kwayprojections_rqrcpend Start 2771: mpi_dst_example_simple_lap_c_facto1_sched1_not_tqrcpbegin Start 2772: mpi_dst_example_simple_lap_c_facto1_sched1_not_tqrcpend Start 2773: mpi_dst_example_simple_lap_c_facto1_sched1_kway_tqrcpbegin Start 2774: mpi_dst_example_simple_lap_c_facto1_sched1_kway_tqrcpend Start 2775: mpi_dst_example_simple_lap_c_facto1_sched1_kwayprojections_tqrcpbegin Start 2776: mpi_dst_example_simple_lap_c_facto1_sched1_kwayprojections_tqrcpend Start 2777: mpi_dst_example_simple_lap_c_facto1_sched1_not_rqrrtbegin Start 2778: mpi_dst_example_simple_lap_c_facto1_sched1_not_rqrrtend Start 2779: mpi_dst_example_simple_lap_c_facto1_sched1_kway_rqrrtbegin Start 2780: mpi_dst_example_simple_lap_c_facto1_sched1_kway_rqrrtend Start 2781: mpi_dst_example_simple_lap_c_facto1_sched1_kwayprojections_rqrrtbegin Start 2782: mpi_dst_example_simple_lap_c_facto1_sched1_kwayprojections_rqrrtend Start 2783: mpi_dst_example_simple_lap_c_facto1_sched1_kway_pqrcpilu0 Start 2784: mpi_dst_example_simple_lap_c_facto1_sched1_kway_pqrcpilu1 Start 2785: 
mpi_dst_example_simple_lap_c_facto2_sched1_not_svdbegin Start 2786: mpi_dst_example_simple_lap_c_facto2_sched1_not_svdend Start 2787: mpi_dst_example_simple_lap_c_facto2_sched1_kway_svdbegin Start 2788: mpi_dst_example_simple_lap_c_facto2_sched1_kway_svdend Start 2789: mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_svdbegin Start 2790: mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_svdend Start 2791: mpi_dst_example_simple_lap_c_facto2_sched1_not_pqrcpbegin Start 2792: mpi_dst_example_simple_lap_c_facto2_sched1_not_pqrcpend Start 2793: mpi_dst_example_simple_lap_c_facto2_sched1_kway_pqrcpbegin Start 2794: mpi_dst_example_simple_lap_c_facto2_sched1_kway_pqrcpend Start 2795: mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_pqrcpbegin Start 2796: mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_pqrcpend Start 2797: mpi_dst_example_simple_lap_c_facto2_sched1_not_rqrcpbegin Start 2798: mpi_dst_example_simple_lap_c_facto2_sched1_not_rqrcpend Start 2799: mpi_dst_example_simple_lap_c_facto2_sched1_kway_rqrcpbegin Start 2800: mpi_dst_example_simple_lap_c_facto2_sched1_kway_rqrcpend Start 2801: mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_rqrcpbegin Start 2802: mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_rqrcpend Start 2803: mpi_dst_example_simple_lap_c_facto2_sched1_not_tqrcpbegin Start 2804: mpi_dst_example_simple_lap_c_facto2_sched1_not_tqrcpend Start 2805: mpi_dst_example_simple_lap_c_facto2_sched1_kway_tqrcpbegin Start 2806: mpi_dst_example_simple_lap_c_facto2_sched1_kway_tqrcpend Start 2807: mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_tqrcpbegin Start 2808: mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_tqrcpend Start 2809: mpi_dst_example_simple_lap_c_facto2_sched1_not_rqrrtbegin Start 2810: mpi_dst_example_simple_lap_c_facto2_sched1_not_rqrrtend Start 2811: mpi_dst_example_simple_lap_c_facto2_sched1_kway_rqrrtbegin Start 2812: 
mpi_dst_example_simple_lap_c_facto2_sched1_kway_rqrrtend Start 2813: mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_rqrrtbegin Start 2814: mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_rqrrtend Start 2815: mpi_dst_example_simple_lap_c_facto2_sched1_kway_pqrcpilu0 Start 2816: mpi_dst_example_simple_lap_c_facto2_sched1_kway_pqrcpilu1 Start 2817: mpi_dst_example_simple_lap_c_facto3_sched1_not_svdbegin Start 2818: mpi_dst_example_simple_lap_c_facto3_sched1_not_svdend Start 2819: mpi_dst_example_simple_lap_c_facto3_sched1_kway_svdbegin Start 2820: mpi_dst_example_simple_lap_c_facto3_sched1_kway_svdend Start 2821: mpi_dst_example_simple_lap_c_facto3_sched1_kwayprojections_svdbegin Start 2822: mpi_dst_example_simple_lap_c_facto3_sched1_kwayprojections_svdend Start 2823: mpi_dst_example_simple_lap_c_facto3_sched1_not_pqrcpbegin Start 2824: mpi_dst_example_simple_lap_c_facto3_sched1_not_pqrcpend Start 2825: mpi_dst_example_simple_lap_c_facto3_sched1_kway_pqrcpbegin Start 2826: mpi_dst_example_simple_lap_c_facto3_sched1_kway_pqrcpend Start 2827: mpi_dst_example_simple_lap_c_facto3_sched1_kwayprojections_pqrcpbegin Start 2828: mpi_dst_example_simple_lap_c_facto3_sched1_kwayprojections_pqrcpend Start 2829: mpi_dst_example_simple_lap_c_facto3_sched1_not_rqrcpbegin Start 2830: mpi_dst_example_simple_lap_c_facto3_sched1_not_rqrcpend Start 2831: mpi_dst_example_simple_lap_c_facto3_sched1_kway_rqrcpbegin Start 2832: mpi_dst_example_simple_lap_c_facto3_sched1_kway_rqrcpend Start 2833: mpi_dst_example_simple_lap_c_facto3_sched1_kwayprojections_rqrcpbegin Start 2834: mpi_dst_example_simple_lap_c_facto3_sched1_kwayprojections_rqrcpend Start 2835: mpi_dst_example_simple_lap_c_facto3_sched1_not_tqrcpbegin Start 2836: mpi_dst_example_simple_lap_c_facto3_sched1_not_tqrcpend Start 2837: mpi_dst_example_simple_lap_c_facto3_sched1_kway_tqrcpbegin Start 2838: mpi_dst_example_simple_lap_c_facto3_sched1_kway_tqrcpend Start 2839: 
mpi_dst_example_simple_lap_c_facto3_sched1_kwayprojections_tqrcpbegin Start 2840: mpi_dst_example_simple_lap_c_facto3_sched1_kwayprojections_tqrcpend Start 2841: mpi_dst_example_simple_lap_c_facto3_sched1_not_rqrrtbegin Start 2842: mpi_dst_example_simple_lap_c_facto3_sched1_not_rqrrtend Start 2843: mpi_dst_example_simple_lap_c_facto3_sched1_kway_rqrrtbegin Start 2844: mpi_dst_example_simple_lap_c_facto3_sched1_kway_rqrrtend Start 2845: mpi_dst_example_simple_lap_c_facto3_sched1_kwayprojections_rqrrtbegin Start 2846: mpi_dst_example_simple_lap_c_facto3_sched1_kwayprojections_rqrrtend Start 2847: mpi_dst_example_simple_lap_c_facto3_sched1_kway_pqrcpilu0 Start 2848: mpi_dst_example_simple_lap_c_facto3_sched1_kway_pqrcpilu1 Start 2849: mpi_dst_example_simple_lap_c_facto4_sched1_not_svdbegin Start 2850: mpi_dst_example_simple_lap_c_facto4_sched1_not_svdend Start 2851: mpi_dst_example_simple_lap_c_facto4_sched1_kway_svdbegin Start 2852: mpi_dst_example_simple_lap_c_facto4_sched1_kway_svdend Start 2853: mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_svdbegin Start 2854: mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_svdend Start 2855: mpi_dst_example_simple_lap_c_facto4_sched1_not_pqrcpbegin Start 2856: mpi_dst_example_simple_lap_c_facto4_sched1_not_pqrcpend Start 2857: mpi_dst_example_simple_lap_c_facto4_sched1_kway_pqrcpbegin Start 2858: mpi_dst_example_simple_lap_c_facto4_sched1_kway_pqrcpend Start 2859: mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_pqrcpbegin Start 2860: mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_pqrcpend Start 2861: mpi_dst_example_simple_lap_c_facto4_sched1_not_rqrcpbegin Start 2862: mpi_dst_example_simple_lap_c_facto4_sched1_not_rqrcpend Start 2863: mpi_dst_example_simple_lap_c_facto4_sched1_kway_rqrcpbegin Start 2864: mpi_dst_example_simple_lap_c_facto4_sched1_kway_rqrcpend Start 2865: mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_rqrcpbegin Start 2866: 
mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_rqrcpend Start 2867: mpi_dst_example_simple_lap_c_facto4_sched1_not_tqrcpbegin Start 2868: mpi_dst_example_simple_lap_c_facto4_sched1_not_tqrcpend Start 2869: mpi_dst_example_simple_lap_c_facto4_sched1_kway_tqrcpbegin Start 2870: mpi_dst_example_simple_lap_c_facto4_sched1_kway_tqrcpend Start 2871: mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_tqrcpbegin Start 2872: mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_tqrcpend Start 2873: mpi_dst_example_simple_lap_c_facto4_sched1_not_rqrrtbegin Start 2874: mpi_dst_example_simple_lap_c_facto4_sched1_not_rqrrtend Start 2875: mpi_dst_example_simple_lap_c_facto4_sched1_kway_rqrrtbegin Start 2876: mpi_dst_example_simple_lap_c_facto4_sched1_kway_rqrrtend Start 2877: mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_rqrrtbegin Start 2878: mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_rqrrtend Start 2879: mpi_dst_example_simple_lap_c_facto4_sched1_kway_pqrcpilu0 Start 2880: mpi_dst_example_simple_lap_c_facto4_sched1_kway_pqrcpilu1 Start 2881: mpi_dst_example_simple_lap_z_facto0_sched1_not_svdbegin Start 2882: mpi_dst_example_simple_lap_z_facto0_sched1_not_svdend Start 2883: mpi_dst_example_simple_lap_z_facto0_sched1_kway_svdbegin Start 2884: mpi_dst_example_simple_lap_z_facto0_sched1_kway_svdend Start 2885: mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_svdbegin Start 2886: mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_svdend Start 2887: mpi_dst_example_simple_lap_z_facto0_sched1_not_pqrcpbegin Start 2888: mpi_dst_example_simple_lap_z_facto0_sched1_not_pqrcpend Start 2889: mpi_dst_example_simple_lap_z_facto0_sched1_kway_pqrcpbegin Start 2890: mpi_dst_example_simple_lap_z_facto0_sched1_kway_pqrcpend Start 2891: mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_pqrcpbegin Start 2892: mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_pqrcpend Start 2893: 
mpi_dst_example_simple_lap_z_facto0_sched1_not_rqrcpbegin Start 2894: mpi_dst_example_simple_lap_z_facto0_sched1_not_rqrcpend Start 2895: mpi_dst_example_simple_lap_z_facto0_sched1_kway_rqrcpbegin Start 2896: mpi_dst_example_simple_lap_z_facto0_sched1_kway_rqrcpend Start 2897: mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_rqrcpbegin Start 2898: mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_rqrcpend Start 2899: mpi_dst_example_simple_lap_z_facto0_sched1_not_tqrcpbegin Start 2900: mpi_dst_example_simple_lap_z_facto0_sched1_not_tqrcpend Start 2901: mpi_dst_example_simple_lap_z_facto0_sched1_kway_tqrcpbegin Start 2902: mpi_dst_example_simple_lap_z_facto0_sched1_kway_tqrcpend Start 2903: mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_tqrcpbegin Test #2530: mpi_dst_example_simple_lap_s_facto0_sched1_not_svdbegin .................***Timeout 507.98 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 
Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.199968e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.159205e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.154978e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.422011e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.579923e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 7.753212e-01 s Time to initialize coeftab 2.900299e-01 s Time to factorize 5.477773e+00 s (946.33 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Start 2530: mpi_dst_example_simple_lap_s_facto0_sched1_not_svdbegin Test #2550: mpi_dst_example_simple_lap_s_facto0_sched1_kway_tqrcpbegin ..............***Timeout 507.84 sec ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.675771e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.738445e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.206216e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to 
factorize 5.477282e-04 s Time for mapping/scheduling 3.986586e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.089988e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 8.456472e-01 s Time to initialize coeftab 3.679144e-01 s Time to factorize 4.558505e+00 s ( 1.11 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.2 Ko / 44.3 Ko ------------------------------------------------ Total 68.4 Ko / 68.5 Ko Time to solve 7.658427e+00 s Start 2550: mpi_dst_example_simple_lap_s_facto0_sched1_kway_tqrcpbegin Test #2554: mpi_dst_example_simple_lap_s_facto0_sched1_not_rqrrtbegin ...............***Timeout 507.82 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread 
number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.221125e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.887365e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.023622e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.209136e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.427867e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.429825e-01 s Time to initialize coeftab 5.175585e-01 s Time to factorize 4.945265e+00 s ( 1.02 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Start 2554: mpi_dst_example_simple_lap_s_facto0_sched1_not_rqrrtbegin Test #2561: mpi_dst_example_simple_lap_s_facto0_sched1_kway_pqrcpilu1 ...............***Timeout 507.78 sec ischedInit: 
The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.525177e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.110489e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.428916e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 
5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.576461e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.743837e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.311375e-01 s Time to initialize coeftab 1.666585e-01 s Time to factorize 3.014959e+00 s ( 1.68 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Start 2561: mpi_dst_example_simple_lap_s_facto0_sched1_kway_pqrcpilu1 Test #2564: mpi_dst_example_simple_lap_s_facto1_sched1_kway_svdbegin ................***Timeout 507.76 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 
Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.397130e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.976936e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.557232e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.414467e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.584117e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.495238e-01 s Time to initialize coeftab 2.746268e-01 s Time to factorize 1.108858e+01 s (483.30 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.119138e+01 s Start 2564: mpi_dst_example_simple_lap_s_facto1_sched1_kway_svdbegin Test #2565: 
mpi_dst_example_simple_lap_s_facto1_sched1_kway_svdend ..................***Timeout 507.76 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 +-------------------------------------------------+ Ordering subtask : 1: 300 1140 2: 200 760 3: 200 660 Ordering method is: Scotch Time to compute ordering 7.704292e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.919950e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.049059e-01 s +-------------------------------------------------+ Mapping/Scheduling 
subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.419159e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.871194e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.276391e-01 s Time to initialize coeftab 3.031762e-01 s Time to factorize 7.559421e+00 s (708.93 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Start 2565: mpi_dst_example_simple_lap_s_facto1_sched1_kway_svdend Test #2567: mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_svdend .......***Timeout 507.76 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections 
Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.743067e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.689870e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.340684e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.200696e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.899729e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.652126e-01 s Time to initialize coeftab 8.036822e-02 s Time to factorize 1.088936e+01 s (492.14 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko Start 2567: mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_svdend Test #2568: 
mpi_dst_example_simple_lap_s_facto1_sched1_not_pqrcpbegin ...............***Timeout 507.76 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.543825e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.417351e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.802790e-01 s +-------------------------------------------------+ 
Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.647953e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.818078e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.749074e-01 s Time to initialize coeftab 7.765235e-01 s Time to factorize 5.712063e+00 s (938.20 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Start 2568: mpi_dst_example_simple_lap_s_facto1_sched1_not_pqrcpbegin Test #2571: mpi_dst_example_simple_lap_s_facto1_sched1_kway_pqrcpend ................***Timeout 507.74 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 
1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.488734e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.087803e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.943403e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.634425e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.758782e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.131559e-01 s Time to initialize coeftab 2.201007e-01 s Time to factorize 6.093301e+00 s (879.50 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Start 2571: mpi_dst_example_simple_lap_s_facto1_sched1_kway_pqrcpend Test #2572: mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_pqrcpbegin ...***Timeout 
507.74 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.236323e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.432402e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.083905e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 
Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.491503e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.484558e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.239270e-01 s Time to initialize coeftab 3.384094e-01 s Time to factorize 1.005776e+01 s (532.83 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Start 2572: mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_pqrcpbegin Test #2573: mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_pqrcpend .....***Timeout 507.74 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and 
projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.746029e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.270977e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.070932e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.673662e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.975708e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.858732e-01 s Time to initialize coeftab 1.369036e-01 s Time to factorize 5.030480e+00 s ( 1.04 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Start 2573: mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_pqrcpend Test #2574: 
mpi_dst_example_simple_lap_s_facto1_sched1_not_rqrcpbegin ...............***Timeout 507.74 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 +-------------------------------------------------+ Ordering subtask : 1: 300 1140 2: 200 760 3: 200 660 Ordering method is: Scotch Time to compute ordering 5.180339e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.183772e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.388810e-01 s +-------------------------------------------------+ 
Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.059192e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.297763e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.488774e-01 s Time to initialize coeftab 5.202350e-01 s Time to factorize 8.054002e+00 s (665.39 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44 Ko / 44.3 Ko ------------------------------------------------ Total 68.2 Ko / 68.5 Ko Start 2574: mpi_dst_example_simple_lap_s_facto1_sched1_not_rqrcpbegin Test #2576: mpi_dst_example_simple_lap_s_facto1_sched1_kway_rqrcpbegin ..............***Timeout 507.73 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion 
per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.686590e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.562779e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.082368e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.939355e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.801999e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.298901e-01 s Time to initialize coeftab 6.843521e-01 s Time to factorize 8.092025e+00 s (662.27 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44 Ko / 44.3 Ko ------------------------------------------------ Total 68.2 Ko / 68.5 Ko Start 2576: 
mpi_dst_example_simple_lap_s_facto1_sched1_kway_rqrcpbegin Test #2577: mpi_dst_example_simple_lap_s_facto1_sched1_kway_rqrcpend ................***Timeout 507.73 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.686338e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.363522e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.275118e-01 s 
+-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.134757e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.818008e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.625221e-01 s Time to initialize coeftab 1.169347e-01 s Time to factorize 5.866504e+00 s (913.50 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.274594e+01 s Start 2577: mpi_dst_example_simple_lap_s_facto1_sched1_kway_rqrcpend 2392/3626 Test #2615: mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_tqrcpbegin ...***Timeout 507.70 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress 
minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.177824e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.193683e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.035064e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.139688e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.337688e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.286980e-01 s Time to initialize coeftab 8.861463e-01 s Time to factorize 6.645548e+00 s ( 1.50 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.4 Ko / 88.6 Ko Start 2615: 
mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_tqrcpbegin 2392/3626 Test #2616: mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_tqrcpend .....***Timeout 507.71 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.199572e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.127566e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 
Time for reordering 1.442292e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.116769e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.739163e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 5.916568e-01 s Time to initialize coeftab 1.380460e-01 s Time to factorize 4.381957e+00 s ( 2.28 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko Start 2616: mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_tqrcpend 2392/3626 Test #2617: mpi_dst_example_simple_lap_s_facto2_sched1_not_rqrrtbegin ...............***Timeout 507.71 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 
1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.131698e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.995188e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.446571e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.415157e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.520070e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.612447e-01 s Time to initialize coeftab 8.820574e-01 s Time to factorize 6.601665e+00 s ( 1.51 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Start 2617: 
mpi_dst_example_simple_lap_s_facto2_sched1_not_rqrrtbegin 2392/3626 Test #2619: mpi_dst_example_simple_lap_s_facto2_sched1_kway_rqrrtbegin ..............***Timeout 507.71 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.024995e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.021844e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 
8.595450e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 9.926946e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.139453e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.643776e-01 s Time to initialize coeftab 3.518398e-01 s Time to factorize 7.165912e+00 s ( 1.39 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko Start 2619: mpi_dst_example_simple_lap_s_facto2_sched1_kway_rqrrtbegin 2392/3626 Test #2621: mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_rqrrtbegin ...***Timeout 507.70 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting 
Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.093281e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.279338e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.552345e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.119350e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.339203e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 4.568508e-01 s Time to initialize coeftab 4.479909e-01 s Time to factorize 5.540033e+00 s ( 1.80 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Start 2621: 
mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_rqrrtbegin 2392/3626 Test #2625: mpi_dst_example_simple_lap_d_facto0_sched1_not_svdbegin .................***Timeout 507.68 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.050766e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.731905e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for 
reordering 1.812258e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 8.106858e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.226091e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.128242e-01 s Time to initialize coeftab 2.446857e-01 s Time to factorize 4.780124e+00 s ( 1.06 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Start 2625: mpi_dst_example_simple_lap_d_facto0_sched1_not_svdbegin 2392/3626 Test #2626: mpi_dst_example_simple_lap_d_facto0_sched1_not_svdend ...................***Timeout 507.68 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 
16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.185371e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.506678e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.436236e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.152081e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.389777e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.625414e-01 s Time to initialize coeftab 2.044231e-01 s Time to factorize 4.469546e+00 s ( 1.13 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko 
------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 6.879191e-01 s Start 2626: mpi_dst_example_simple_lap_d_facto0_sched1_not_svdend 2392/3626 Test #2627: mpi_dst_example_simple_lap_d_facto0_sched1_kway_svdbegin ................***Timeout 507.69 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.762032e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.981626e-03 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.115106e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.446226e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.924380e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.593881e-01 s Time to initialize coeftab 8.478524e-01 s Time to factorize 6.752030e+00 s (767.74 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko Start 2627: mpi_dst_example_simple_lap_d_facto0_sched1_kway_svdbegin 2392/3626 Test #2628: mpi_dst_example_simple_lap_d_facto0_sched1_kway_svdend ..................***Timeout 507.69 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 
1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.589749e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.266879e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.086793e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.463622e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.887782e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.255304e+00 s Time to initialize coeftab 8.422230e-02 s Time to factorize 5.013434e+00 s ( 1.01 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko 
/ 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 9.777539e+00 s Start 2628: mpi_dst_example_simple_lap_d_facto0_sched1_kway_svdend 2392/3626 Test #2629: mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_svdbegin .....***Timeout 507.69 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.642918e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.505301e-01 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.306575e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.234602e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.391269e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 9.659491e-01 s Time to initialize coeftab 3.480686e-01 s Time to factorize 6.534005e+00 s (793.36 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Start 2629: mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_svdbegin 2392/3626 Test #2630: mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_svdend .......***Timeout 507.69 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL 
GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.560157e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.354237e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.847204e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.227925e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.760644e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.448432e-01 s Time to initialize coeftab 1.500147e-01 s Time to factorize 2.657980e+00 s ( 1.90 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko 
------------------------------------------------ Total 137 Ko / 137 Ko Start 2630: mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_svdend 2392/3626 Test #2631: mpi_dst_example_simple_lap_d_facto0_sched1_not_pqrcpbegin ...............***Timeout 507.70 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 3: 200 660 2: 200 760 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.391769e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.411952e-01 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.256864e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.004891e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.531595e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.700873e-01 s Time to initialize coeftab 5.900754e-01 s Time to factorize 3.558267e+00 s ( 1.42 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 6.169920e-01 s - iteration 1 : total iteration time 13.2 s error 1.8912e-14 Start 2631: mpi_dst_example_simple_lap_d_facto0_sched1_not_pqrcpbegin 2392/3626 Test #2639: mpi_dst_example_simple_lap_d_facto0_sched1_kway_rqrcpbegin ..............***Timeout 507.65 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per 
process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.087382e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.250736e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.751177e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 9.867985e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.263819e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.421837e-01 s Time to initialize coeftab 4.413306e-01 s Time to factorize 5.230033e+00 s (991.16 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: 
------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.4 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Start 2639: mpi_dst_example_simple_lap_d_facto0_sched1_kway_rqrcpbegin 2392/3626 Test #2640: mpi_dst_example_simple_lap_d_facto0_sched1_kway_rqrcpend ................***Timeout 507.66 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.120734e+01 s +-------------------------------------------------+ Symbolic 
factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.916789e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.611437e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.353463e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.285537e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.208326e-01 s Time to initialize coeftab 1.013482e-01 s Time to factorize 3.921705e+00 s ( 1.29 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko Start 2640: mpi_dst_example_simple_lap_d_facto0_sched1_kway_rqrcpend 2392/3626 Test #2642: mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_rqrcpend .....***Timeout 507.65 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of 
threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.390246e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.213374e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.215607e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 9.435850e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.503218e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.121394e-01 s Time to initialize coeftab 9.605257e-02 s Time to factorize 1.802044e+00 s ( 2.81 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: 
------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Start 2642: mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_rqrcpend 2392/3626 Test #2648: mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_tqrcpend .....***Timeout 507.62 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.548062e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 
Fill-in of L 17.592432 Time to compute symbol matrix 7.566730e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.264414e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.097884e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.726713e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 5.237987e-01 s Time to initialize coeftab 2.064687e-01 s Time to factorize 2.980234e+00 s ( 1.70 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko Start 2648: mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_tqrcpend 2392/3626 Test #2651: mpi_dst_example_simple_lap_d_facto0_sched1_kway_rqrrtbegin ..............***Timeout 507.61 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple 
Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.249558e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.124584e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.181477e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.669850e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.444506e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.048951e-01 s Time to initialize coeftab 8.167947e-01 s Time to factorize 4.276015e+00 s ( 1.18 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in 
diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Start 2651: mpi_dst_example_simple_lap_d_facto0_sched1_kway_rqrrtbegin 2392/3626 Test #2652: mpi_dst_example_simple_lap_d_facto0_sched1_kway_rqrrtend ................***Timeout 507.62 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.134799e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 
Time to compute symbol matrix 4.166737e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.911231e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.769598e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.447339e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.706780e-01 s Time to initialize coeftab 1.449080e-01 s Time to factorize 4.021824e+00 s ( 1.26 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Start 2652: mpi_dst_example_simple_lap_d_facto0_sched1_kway_rqrrtend 2392/3626 Test #2653: mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_rqrrtbegin ...***Timeout 507.62 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication 
support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.225903e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.785537e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.681114e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 8.916002e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.376678e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.043817e-01 s Time to initialize coeftab 5.761633e-01 s Time to factorize 4.843506e+00 s ( 1.05 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes 
Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko Start 2653: mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_rqrrtbegin 2392/3626 Test #2656: mpi_dst_example_simple_lap_d_facto0_sched1_kway_pqrcpilu1 ...............***Timeout 507.61 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.105260e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to 
compute symbol matrix 6.280668e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.314387e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.625934e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.292489e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 5.176498e-01 s Time to initialize coeftab 9.021657e-02 s Time to factorize 3.556292e+00 s ( 1.42 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Start 2656: mpi_dst_example_simple_lap_d_facto0_sched1_kway_pqrcpilu1 2392/3626 Test #2657: mpi_dst_example_simple_lap_d_facto1_sched1_not_svdbegin .................***Timeout 507.61 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 
Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.514565e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.908506e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.953031e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.216014e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.006149e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.080910e+00 s Time to initialize coeftab 8.582451e-01 s Time to factorize 7.168173e+00 s (747.62 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Start 2657: 
mpi_dst_example_simple_lap_d_facto1_sched1_not_svdbegin 2392/3626 Test #2661: mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_svdbegin .....***Timeout 507.59 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.276268e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.292088e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for 
reordering 8.380086e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.375120e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.544365e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.161121e-01 s Time to initialize coeftab 1.007007e+00 s Time to factorize 9.821142e+00 s (545.67 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko Start 2661: mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_svdbegin 2392/3626 Test #2664: mpi_dst_example_simple_lap_d_facto1_sched1_not_pqrcpend .................***Timeout 507.58 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy 
Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.570418e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.470886e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.430740e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.607913e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.789842e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.512642e-01 s Time to initialize coeftab 1.657429e-01 s Time to factorize 5.280506e+00 s (1014.88 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Start 2664: 
mpi_dst_example_simple_lap_d_facto1_sched1_not_pqrcpend 2392/3626 Test #2665: mpi_dst_example_simple_lap_d_facto1_sched1_kway_pqrcpbegin ..............***Timeout 507.58 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.465024e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.963388e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 
1.933562e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.039564e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.623253e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.356226e-01 s Time to initialize coeftab 3.930783e-01 s Time to factorize 6.672863e+00 s (803.12 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko Start 2665: mpi_dst_example_simple_lap_d_facto1_sched1_kway_pqrcpbegin 2392/3626 Test #2666: mpi_dst_example_simple_lap_d_facto1_sched1_kway_pqrcpend ................***Timeout 507.59 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time 
Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.695217e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.836271e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.923446e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 9.479641e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.837772e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.748243e-01 s Time to initialize coeftab 1.928880e-01 s Time to factorize 3.209997e+00 s ( 1.63 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko Start 2666: mpi_dst_example_simple_lap_d_facto1_sched1_kway_pqrcpend 2392/3626 Test #2668: 
mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_pqrcpend .....***Timeout 507.58 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.090635e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.878184e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.104907e-01 s +-------------------------------------------------+ 
Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.693491e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.194810e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.629981e-01 s Time to initialize coeftab 6.681149e-02 s Time to factorize 5.943222e+00 s (901.71 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Start 2668: mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_pqrcpend 2392/3626 Test #2669: mpi_dst_example_simple_lap_d_facto1_sched1_not_rqrcpbegin ...............***Timeout 507.58 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method 
CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.972519e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.352852e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.518830e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.070620e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.360280e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.117959e+00 s Time to initialize coeftab 7.273640e-01 s Time to factorize 6.686991e+00 s (801.42 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88 Ko / 88.6 Ko ------------------------------------------------ Total 136 Ko / 137 Ko Start 2669: 
mpi_dst_example_simple_lap_d_facto1_sched1_not_rqrcpbegin 2392/3626 Test #2672: mpi_dst_example_simple_lap_d_facto1_sched1_kway_rqrcpend ................***Timeout 507.57 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.162763e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.098771e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 
1.928132e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 8.046859e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.308021e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 5.295779e-01 s Time to initialize coeftab 1.810009e-01 s Time to factorize 7.449958e+00 s (719.34 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Start 2672: mpi_dst_example_simple_lap_d_facto1_sched1_kway_rqrcpend 2392/3626 Test #2673: mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_rqrcpbegin ...***Timeout 507.57 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal 
Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.985600e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.597220e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.688562e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.055948e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.140977e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.146451e-01 s Time to initialize coeftab 9.798359e-01 s Time to factorize 8.199451e+00 s (653.59 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Start 2673: 
mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_rqrcpbegin 2392/3626 Test #2675: mpi_dst_example_simple_lap_d_facto1_sched1_not_tqrcpbegin ...............***Timeout 507.57 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 2675: mpi_dst_example_simple_lap_d_facto1_sched1_not_tqrcpbegin 2392/3626 Test #2676: mpi_dst_example_simple_lap_d_facto1_sched1_not_tqrcpend .................***Timeout 507.57 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled 
PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.371708e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.755217e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.174229e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.963325e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.542277e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.814263e-01 s Time to initialize coeftab 8.000130e-02 s Time to 
factorize 7.903733e+00 s (678.04 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Start 2676: mpi_dst_example_simple_lap_d_facto1_sched1_not_tqrcpend 2392/3626 Test #2677: mpi_dst_example_simple_lap_d_facto1_sched1_kway_tqrcpbegin ..............***Timeout 507.57 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: 
Scotch Time to compute ordering 7.421814e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.057213e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.623784e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 8.631881e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.581484e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.530726e-01 s Time to initialize coeftab 9.642038e-01 s Time to factorize 8.214992e+00 s (652.35 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88 Ko / 88.6 Ko ------------------------------------------------ Total 136 Ko / 137 Ko Start 2677: mpi_dst_example_simple_lap_d_facto1_sched1_kway_tqrcpbegin 2392/3626 Test #2679: mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_tqrcpbegin ...***Timeout 507.57 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.412718e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.159925e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.967420e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.246482e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.561731e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t 
Time to initialize internal csc 3.674233e-01 s Time to initialize coeftab 4.774549e-01 s Time to factorize 1.076096e+01 s (498.01 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88 Ko / 88.6 Ko ------------------------------------------------ Total 136 Ko / 137 Ko Start 2679: mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_tqrcpbegin 2392/3626 Test #2680: mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_tqrcpend .....***Timeout 507.57 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 
300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.207727e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.330829e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.816133e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.703700e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.448856e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.813145e-01 s Time to initialize coeftab 2.212325e-01 s Time to factorize 9.323873e+00 s (574.77 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Start 2680: mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_tqrcpend 2392/3626 Test #2681: mpi_dst_example_simple_lap_d_facto1_sched1_not_rqrrtbegin ...............***Timeout 507.58 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: 
Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.834046e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.258477e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.682927e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.727042e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.882983e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 
3.978164e-02 s Time to initialize coeftab 6.980172e-01 s Time to factorize 5.613984e+00 s (954.59 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Start 2681: mpi_dst_example_simple_lap_d_facto1_sched1_not_rqrrtbegin 2392/3626 Test #2682: mpi_dst_example_simple_lap_d_facto1_sched1_not_rqrrtend .................***Timeout 507.58 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 
+-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.169488e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.254225e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.868825e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.225276e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.577181e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.396770e-01 s Time to initialize coeftab 1.108024e-01 s Time to factorize 6.191272e+00 s (865.59 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Start 2682: mpi_dst_example_simple_lap_d_facto1_sched1_not_rqrrtend 2392/3626 Test #2683: mpi_dst_example_simple_lap_d_facto1_sched1_kway_rqrrtbegin ..............***Timeout 507.59 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.148045e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.749291e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.588302e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.050495e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.339886e+01 s 
+-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.725421e-01 s Time to initialize coeftab 4.203966e-01 s Time to factorize 9.933961e+00 s (539.47 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Start 2683: mpi_dst_example_simple_lap_d_facto1_sched1_kway_rqrrtbegin 2392/3626 Test #2690: mpi_dst_example_simple_lap_d_facto2_sched1_not_svdend ...................***Timeout 507.55 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric 
Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.505241e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.690158e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.027558e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 8.188594e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.683608e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.805657e-01 s Time to initialize coeftab 1.564723e-01 s Time to factorize 4.862135e+00 s ( 2.05 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko Start 2690: mpi_dst_example_simple_lap_d_facto2_sched1_not_svdend 2392/3626 Test #2692: mpi_dst_example_simple_lap_d_facto2_sched1_kway_svdend ..................***Timeout 507.54 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse 
matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.778805e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.965765e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.573989e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 5.231675e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.911333e+01 s 
+-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.719435e-01 s Time to initialize coeftab 8.995917e-02 s Time to factorize 4.119051e+00 s ( 2.42 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Start 2692: mpi_dst_example_simple_lap_d_facto2_sched1_kway_svdend 2392/3626 Test #2693: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_svdbegin .....***Timeout 507.55 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 
+-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch 1: 300 1140 2: 200 760 3: 200 660 Time to compute ordering 6.904602e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.584000e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.072855e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 7.649156e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.025202e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.152105e-01 s Time to initialize coeftab 4.475747e-01 s Time to factorize 9.520751e+00 s ( 1.05 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Start 2693: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_svdbegin 2392/3626 Test #2694: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_svdend .......***Timeout 507.55 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.837192e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.018434e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.066190e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 8.361942e-01 s +-------------------------------------------------+ Analyze task: Total time for 
analyze 5.953286e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.193783e-01 s Time to initialize coeftab 3.289628e-01 s Time to factorize 6.202089e+00 s ( 1.61 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Start 2694: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_svdend 2392/3626 Test #2698: mpi_dst_example_simple_lap_d_facto2_sched1_kway_pqrcpend ................***Timeout 507.53 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix 
type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.289376e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.567213e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.680575e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.463760e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.504130e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 9.253682e-01 s Time to initialize coeftab 2.260310e-01 s Time to factorize 2.517697e+00 s ( 3.97 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Start 2698: mpi_dst_example_simple_lap_d_facto2_sched1_kway_pqrcpend 2392/3626 Test #2699: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_pqrcpbegin ...***Timeout 507.53 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled 
PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.958096e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.550230e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.555031e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 4.628638e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.106780e+01 s +-------------------------------------------------+ 
Factorization task: Factorization used: LU Time to initialize internal csc 9.574745e-02 s Time to initialize coeftab 4.844555e-01 s Time to factorize 3.113962e+00 s ( 3.21 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 8.955485e+00 s Start 2699: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_pqrcpbegin 2392/3626 Test #2701: mpi_dst_example_simple_lap_d_facto2_sched1_not_rqrcpbegin ...............***Timeout 507.49 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double 
Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.629157e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.333689e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.892233e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.033761e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.771293e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 7.500448e-01 s Time to initialize coeftab 4.351079e-01 s Time to factorize 7.014465e+00 s ( 1.42 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 225 Ko / 226 Ko Start 2701: mpi_dst_example_simple_lap_d_facto2_sched1_not_rqrcpbegin 2392/3626 Test #2702: mpi_dst_example_simple_lap_d_facto2_sched1_not_rqrcpend .................***Timeout 507.49 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ 
Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.181247e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.635654e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.101648e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 7.295084e-01 s +-------------------------------------------------+ Analyze task: Total time for 
analyze 5.281109e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.644036e-01 s Time to initialize coeftab 9.697614e-02 s Time to factorize 4.676415e+00 s ( 2.14 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Start 2702: mpi_dst_example_simple_lap_d_facto2_sched1_not_rqrcpend Test #2578: mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_rqrcpbegin ...***Timeout 490.52 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 
660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.001795e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.770458e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.012116e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.817277e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.240910e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.632961e-01 s Time to initialize coeftab 6.855640e-01 s Time to factorize 7.997448e+00 s (670.10 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44 Ko / 44.3 Ko ------------------------------------------------ Total 68.2 Ko / 68.5 Ko Time to solve 1.161403e+01 s Start 2578: mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_rqrcpbegin Test #2582: mpi_dst_example_simple_lap_s_facto1_sched1_kway_tqrcpbegin ..............***Timeout 490.51 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.500790e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.313099e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.321294e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 8.544565e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.653909e+01 s 
+-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.863059e-01 s Time to initialize coeftab 6.877664e-01 s Time to factorize 8.787003e+00 s (609.89 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44 Ko / 44.3 Ko ------------------------------------------------ Total 68.2 Ko / 68.5 Ko Time to solve 1.123761e+01 s Start 2582: mpi_dst_example_simple_lap_s_facto1_sched1_kway_tqrcpbegin Test #2585: mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_tqrcpend .....***Timeout 490.50 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections 
width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.553169e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.993886e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.095486e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.130171e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.670185e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.046765e-01 s Time to initialize coeftab 7.375966e-02 s Time to factorize 6.928054e+00 s (773.53 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.139751e+01 s Start 2585: mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_tqrcpend Test #2586: mpi_dst_example_simple_lap_s_facto1_sched1_not_rqrrtbegin ...............***Timeout 490.50 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.011458e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.151076e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.500586e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 
1.101540e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.308653e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.997114e-01 s Time to initialize coeftab 2.689083e-01 s Time to factorize 7.804027e+00 s (686.71 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.136242e+01 s Start 2586: mpi_dst_example_simple_lap_s_facto1_sched1_not_rqrrtbegin Test #2588: mpi_dst_example_simple_lap_s_facto1_sched1_kway_rqrrtbegin ..............***Timeout 490.49 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The 
thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.392966e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.539969e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.427076e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.628142e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.635217e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.748049e-01 s Time to initialize coeftab 1.167950e+00 s Time to factorize 5.956508e+00 s (899.70 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.212913e+01 s Start 2588: mpi_dst_example_simple_lap_s_facto1_sched1_kway_rqrrtbegin Test #2589: mpi_dst_example_simple_lap_s_facto1_sched1_kway_rqrrtend ................***Timeout 490.49 sec ischedInit: The thread number has 
been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.527447e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.644446e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.510966e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: 
Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.586095e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.648519e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.379631e-01 s Time to initialize coeftab 8.816102e-02 s Time to factorize 8.096893e+00 s (661.87 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.304503e+01 s Start 2589: mpi_dst_example_simple_lap_s_facto1_sched1_kway_rqrrtend Test #2590: mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_rqrrtbegin ...***Timeout 490.50 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 
Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.975843e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.471134e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.994983e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 9.966237e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.140336e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.611466e-01 s Time to initialize coeftab 9.053872e-01 s Time to factorize 7.139292e+00 s (750.65 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.627459e+01 s Start 2590: mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_rqrrtbegin Test #2591: 
mpi_dst_example_simple_lap_s_facto1_sched1_kway_pqrcpilu0 ...............***Timeout 490.50 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.852674e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.993596e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.018986e-01 s +-------------------------------------------------+ Mapping/Scheduling 
subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.602520e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.222767e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.372550e-01 s Time to initialize coeftab 8.107661e-02 s Time to factorize 8.973164e+00 s (597.23 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.078607e+01 s Start 2591: mpi_dst_example_simple_lap_s_facto1_sched1_kway_pqrcpilu0 Test #2592: mpi_dst_example_simple_lap_s_facto1_sched1_kway_pqrcpilu1 ...............***Timeout 490.51 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal 
Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.101162e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.677124e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.094430e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.886678e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.476304e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.550569e-01 s Time to initialize coeftab 1.319976e-01 s Time to factorize 5.990004e+00 s (894.67 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 9.393144e+00 s 
Start 2592: mpi_dst_example_simple_lap_s_facto1_sched1_kway_pqrcpilu1 Test #2593: mpi_dst_example_simple_lap_s_facto2_sched1_not_svdbegin .................***Timeout 490.51 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.211436e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.678252e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 
4.462896e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 6.766808e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.342619e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 4.641752e-01 s Time to initialize coeftab 4.224731e-01 s Time to factorize 8.769210e+00 s ( 1.14 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 1.061407e+01 s Start 2593: mpi_dst_example_simple_lap_s_facto2_sched1_not_svdbegin Test #2595: mpi_dst_example_simple_lap_s_facto2_sched1_kway_svdbegin ................***Timeout 490.50 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 
Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.970248e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.192204e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.460183e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.105234e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.708282e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.723144e-01 s Time to initialize coeftab 7.866675e-01 s Time to factorize 7.662481e+00 s ( 1.30 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko 
------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 1.000949e+01 s Start 2595: mpi_dst_example_simple_lap_s_facto2_sched1_kway_svdbegin Test #2596: mpi_dst_example_simple_lap_s_facto2_sched1_kway_svdend ..................***Timeout 490.50 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.820119e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.539478e-02 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.232683e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 4.226753e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.909647e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.482363e-01 s Time to initialize coeftab 5.648439e-02 s Time to factorize 3.092422e+00 s ( 3.23 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 4.487385e+00 s Start 2596: mpi_dst_example_simple_lap_s_facto2_sched1_kway_svdend Test #2597: mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_svdbegin .....***Timeout 490.51 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: 
PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.040965e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.860597e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.745372e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.623825e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.385055e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 5.192781e-01 s Time to initialize coeftab 5.692542e-01 s Time to factorize 5.209623e+00 s ( 1.92 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 
0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 9.630969e+00 s Start 2597: mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_svdbegin Test #2601: mpi_dst_example_simple_lap_s_facto2_sched1_kway_pqrcpbegin ..............***Timeout 490.47 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.537495e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct 
Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.419503e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.852939e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.268166e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.693913e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.888483e-01 s Time to initialize coeftab 4.014659e-01 s Time to factorize 4.711391e+00 s ( 2.12 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 6.328447e-01 s Start 2601: mpi_dst_example_simple_lap_s_facto2_sched1_kway_pqrcpbegin Test #2606: mpi_dst_example_simple_lap_s_facto2_sched1_not_rqrcpend .................***Timeout 490.41 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational 
models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.107223e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.418864e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.811196e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 8.595222e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.262622e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.219729e-01 s Time to initialize coeftab 7.428358e-02 s Time to factorize 4.353656e+00 s ( 2.29 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: 
------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 5.610370e-01 s Start 2606: mpi_dst_example_simple_lap_s_facto2_sched1_not_rqrcpend Test #2607: mpi_dst_example_simple_lap_s_facto2_sched1_kway_rqrcpbegin ..............***Timeout 490.40 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.101583e+01 s +-------------------------------------------------+ 
Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.066493e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.284373e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 9.972554e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.242925e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.758288e-01 s Time to initialize coeftab 8.300509e-01 s Time to factorize 5.650787e+00 s ( 1.77 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.4 Ko / 88.6 Ko Start 2607: mpi_dst_example_simple_lap_s_facto2_sched1_kway_rqrcpbegin Test #2608: mpi_dst_example_simple_lap_s_facto2_sched1_kway_rqrcpend ................***Timeout 490.39 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads 
per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.070039e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.961104e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.518236e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 8.400658e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.181823e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 5.350644e-01 s Time to initialize coeftab 2.139327e-01 s Time to factorize 3.460954e+00 s ( 2.88 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: 
------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Start 2608: mpi_dst_example_simple_lap_s_facto2_sched1_kway_rqrcpend Test #2609: mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_rqrcpbegin ...***Timeout 490.39 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.958091e+01 s +-------------------------------------------------+ Symbolic 
factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.392148e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.208012e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 8.366322e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.147413e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 4.755948e-01 s Time to initialize coeftab 6.621254e-01 s Time to factorize 7.794764e+00 s ( 1.28 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.4 Ko / 88.6 Ko Start 2609: mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_rqrcpbegin Test #2610: mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_rqrcpend .....***Timeout 490.39 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking 
size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.046740e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.545572e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.065228e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.485498e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.223146e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.521299e-01 s Time to initialize coeftab 8.552622e-02 s Time to factorize 4.374843e+00 s ( 2.28 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: 
------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko Start 2610: mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_rqrcpend 2392/3626 Test #2704: mpi_dst_example_simple_lap_d_facto2_sched1_kway_rqrcpend ................***Timeout 490.25 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.411494e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number 
of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.949943e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.780591e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.914906e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.728636e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.954386e-01 s Time to initialize coeftab 1.832577e-01 s Time to factorize 3.618990e+00 s ( 2.76 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Start 2704: mpi_dst_example_simple_lap_d_facto2_sched1_kway_rqrcpend 2392/3626 Test #2705: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_rqrcpbegin ...***Timeout 445.42 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 
- Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.108136e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.392179e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.484389e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 6.959791e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.199574e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.784997e-01 s Time to initialize coeftab 7.002424e-01 s Time to factorize 7.051252e+00 s ( 1.42 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: 
------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 225 Ko / 226 Ko Time to solve 9.104444e+00 s Start 2705: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_rqrcpbegin 2392/3626 Test #2706: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_rqrcpend .....***Timeout 445.35 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.162620e+01 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.527179e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.489143e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.513174e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.360535e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.065155e-01 s Time to initialize coeftab 1.438139e-01 s Time to factorize 3.833284e+00 s ( 2.60 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko Start 2706: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_rqrcpend 2392/3626 Test #2707: mpi_dst_example_simple_lap_d_facto2_sched1_not_tqrcpbegin ...............***Timeout 445.28 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication 
support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.214869e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.573220e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.028548e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 6.564428e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.366063e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.489970e-01 s Time to initialize coeftab 5.026404e-01 s Time to factorize 7.961455e+00 s ( 1.25 MFlop/s) Number of operations 8.17 MFlops 
Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Start 2707: mpi_dst_example_simple_lap_d_facto2_sched1_not_tqrcpbegin 2392/3626 Test #2711: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_tqrcpbegin ...***Timeout 432.43 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.177504e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of 
nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.536789e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.132182e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.781295e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.405323e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 4.071224e-01 s Time to initialize coeftab 6.884015e-01 s Time to factorize 6.255894e+00 s ( 1.60 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 225 Ko / 226 Ko Start 2711: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_tqrcpbegin 2392/3626 Test #2713: mpi_dst_example_simple_lap_d_facto2_sched1_not_rqrrtbegin ...............***Timeout 312.06 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of 
threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.361082e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.588073e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.735283e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.138561e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.516401e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.132671e-01 s Time to initialize coeftab 8.166786e-01 s Time to factorize 5.327852e+00 s ( 1.87 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: 
------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Start 2713: mpi_dst_example_simple_lap_d_facto2_sched1_not_rqrrtbegin Test #2422: mpi_dst_example_simple_lap_z_facto1_sched0_kway_tqrcpbegin .............. Passed 135.74 sec Start 2904: mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_tqrcpend Test #2654: mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_rqrrtend ..... Passed 147.89 sec Start 2905: mpi_dst_example_simple_lap_z_facto0_sched1_not_rqrrtbegin Test #2449: mpi_dst_example_simple_lap_z_facto2_sched0_kway_rqrcpend ................ Passed 159.75 sec Start 2906: mpi_dst_example_simple_lap_z_facto0_sched1_not_rqrrtend Test #2434: mpi_dst_example_simple_lap_z_facto2_sched0_not_svdbegin ................. Passed 178.34 sec Start 2907: mpi_dst_example_simple_lap_z_facto0_sched1_kway_rqrrtbegin Test #2424: mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_tqrcpbegin ... Passed 179.30 sec Start 2908: mpi_dst_example_simple_lap_z_facto0_sched1_kway_rqrrtend Test #2411: mpi_dst_example_simple_lap_z_facto1_sched0_kway_pqrcpend ................ Passed 179.45 sec Start 2909: mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_rqrrtbegin Test #2407: mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_svdend ....... Passed 181.08 sec Start 2910: mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_rqrrtend Test #2432: mpi_dst_example_simple_lap_z_facto1_sched0_kway_pqrcpilu0 ............... Passed 181.02 sec Start 2911: mpi_dst_example_simple_lap_z_facto0_sched1_kway_pqrcpilu0 Test #2463: mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_rqrrtend ..... 
Passed 181.68 sec Start 2912: mpi_dst_example_simple_lap_z_facto0_sched1_kway_pqrcpilu1 Test #2413: mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_pqrcpend ..... Passed 187.22 sec Start 2913: mpi_dst_example_simple_lap_z_facto1_sched1_not_svdbegin Test #2461: mpi_dst_example_simple_lap_z_facto2_sched0_kway_rqrrtend ................ Passed 194.61 sec Start 2914: mpi_dst_example_simple_lap_z_facto1_sched1_not_svdend Test #2429: mpi_dst_example_simple_lap_z_facto1_sched0_kway_rqrrtend ................ Passed 195.73 sec Start 2915: mpi_dst_example_simple_lap_z_facto1_sched1_kway_svdbegin Test #2471: mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_svdend ....... Passed 196.24 sec Start 2916: mpi_dst_example_simple_lap_z_facto1_sched1_kway_svdend Test #2388: mpi_dst_example_simple_lap_z_facto0_sched0_not_tqrcpbegin ...............***Timeout 302.65 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically 
set to 256 Test #2389: mpi_dst_example_simple_lap_z_facto0_sched0_not_tqrcpend .................***Timeout 302.65 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2390: mpi_dst_example_simple_lap_z_facto0_sched0_kway_tqrcpbegin ..............***Timeout 302.65 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size 
(min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2391: mpi_dst_example_simple_lap_z_facto0_sched0_kway_tqrcpend ................***Timeout 302.64 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Test #2392: mpi_dst_example_simple_lap_z_facto0_sched0_kwayprojections_tqrcpbegin ...***Timeout 302.64 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set 
to 256 Test #2393: mpi_dst_example_simple_lap_z_facto0_sched0_kwayprojections_tqrcpend .....***Timeout 302.63 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.781590e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.845951e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.713496e-02 s 
+-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.244792e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.049426e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 8.799556e-01 s Time to initialize coeftab 2.279780e-01 s Time to factorize 1.027302e+01 s ( 1.97 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Test #2394: mpi_dst_example_simple_lap_z_facto0_sched0_not_rqrrtbegin ...............***Timeout 302.63 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 
Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.669162e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.853890e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.495430e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 6.410134e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.826089e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.572046e-01 s Time to initialize coeftab 1.814455e-01 s Time to factorize 1.646594e+01 s ( 1.23 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Test #2395: 
mpi_dst_example_simple_lap_z_facto0_sched0_not_rqrrtend .................***Timeout 302.62 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2396: mpi_dst_example_simple_lap_z_facto0_sched0_kway_rqrrtbegin ..............***Timeout 302.62 sec ischedInit: The thread number has been automatically set to 256 Test #2397: mpi_dst_example_simple_lap_z_facto0_sched0_kway_rqrrtend ................***Timeout 302.61 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2398: mpi_dst_example_simple_lap_z_facto0_sched0_kwayprojections_rqrrtbegin ...***Timeout 302.61 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse 
matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Test #2399: mpi_dst_example_simple_lap_z_facto0_sched0_kwayprojections_rqrrtend .....***Timeout 302.60 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections 
distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2400: mpi_dst_example_simple_lap_z_facto0_sched0_kway_pqrcpilu0 ...............***Timeout 302.60 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2401: mpi_dst_example_simple_lap_z_facto0_sched0_kway_pqrcpilu1 ...............***Timeout 302.59 sec ischedInit: The thread number has been automatically set to 256 Test #2402: mpi_dst_example_simple_lap_z_facto1_sched0_not_svdbegin .................***Timeout 302.59 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.016491e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.147909e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.303229e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.213130e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.166882e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Test #2403: 
mpi_dst_example_simple_lap_z_facto1_sched0_not_svdend ...................***Timeout 302.59 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2404: mpi_dst_example_simple_lap_z_facto1_sched0_kway_svdbegin ................***Timeout 302.58 sec ischedInit: The thread number has been automatically set to 256 Test #2405: mpi_dst_example_simple_lap_z_facto1_sched0_kway_svdend ..................***Timeout 302.58 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2406: mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_svdbegin .....***Timeout 302.57 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2408: 
mpi_dst_example_simple_lap_z_facto1_sched0_not_pqrcpbegin ...............***Timeout 302.56 sec ischedInit: The thread number has been automatically set to 256 Test #2409: mpi_dst_example_simple_lap_z_facto1_sched0_not_pqrcpend .................***Timeout 302.56 sec Test #2410: mpi_dst_example_simple_lap_z_facto1_sched0_kway_pqrcpbegin ..............***Timeout 302.56 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2412: mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_pqrcpbegin ...***Timeout 302.55 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: 
Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Test #2414: mpi_dst_example_simple_lap_z_facto1_sched0_not_rqrcpbegin ...............***Timeout 302.54 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2415: 
mpi_dst_example_simple_lap_z_facto1_sched0_not_rqrcpend .................***Timeout 302.53 sec ischedInit: The thread number has been automatically set to 256 Test #2416: mpi_dst_example_simple_lap_z_facto1_sched0_kway_rqrcpbegin ..............***Timeout 302.53 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2417: mpi_dst_example_simple_lap_z_facto1_sched0_kway_rqrcpend ................***Timeout 302.53 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 Test #2418: mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_rqrcpbegin ...***Timeout 302.52 sec ischedInit: The thread number has been automatically set to 256 Test #2419: 
mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_rqrcpend .....***Timeout 302.52 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.934188e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.410603e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.009334e-01 s +-------------------------------------------------+ 
Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.268471e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.091286e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Test #2420: mpi_dst_example_simple_lap_z_facto1_sched0_not_tqrcpbegin ...............***Timeout 302.51 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2421: mpi_dst_example_simple_lap_z_facto1_sched0_not_tqrcpend .................***Timeout 302.51 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2423: mpi_dst_example_simple_lap_z_facto1_sched0_kway_tqrcpend ................***Timeout 302.50 sec ischedInit: The thread number has been 
automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.605632e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.175175e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.090681e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: 
Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.786779e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.232176e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.217816e+00 s Time to initialize coeftab 2.431964e-01 s Test #2425: mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_tqrcpend .....***Timeout 302.49 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2427: mpi_dst_example_simple_lap_z_facto1_sched0_not_rqrrtend .................***Timeout 302.48 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Test #2428: mpi_dst_example_simple_lap_z_facto1_sched0_kway_rqrrtbegin ..............***Timeout 302.48 sec ischedInit: The thread number has been 
automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Test #2430: mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_rqrrtbegin ...***Timeout 302.47 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 
1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.013326e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.013542e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.554109e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.291574e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.419860e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 5.959356e-01 s Time to initialize coeftab 6.607079e-01 s Test #2431: mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_rqrrtend .....***Timeout 302.47 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled 
thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Test #2433: mpi_dst_example_simple_lap_z_facto1_sched0_kway_pqrcpilu1 ...............***Timeout 302.46 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2435: mpi_dst_example_simple_lap_z_facto2_sched0_not_svdend ...................***Timeout 302.45 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization 
method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.731862e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.334033e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.884971e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.911056e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.310363e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 8.567040e-01 s Time to initialize coeftab 3.034591e-01 s Time to factorize 1.056549e+01 s ( 3.78 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 355 Ko / 355 Ko ------------------------------------------------ Total 451 Ko / 451 Ko Time to solve 
8.028742e-01 s - iteration 1 : total iteration time 2.3 s error 3.6733e-16 Test #2436: mpi_dst_example_simple_lap_z_facto2_sched0_kway_svdbegin ................***Timeout 302.45 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2438: mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_svdbegin .....***Timeout 302.44 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2439: mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_svdend .......***Timeout 302.44 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2440: mpi_dst_example_simple_lap_z_facto2_sched0_not_pqrcpbegin ...............***Timeout 302.43 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2441: mpi_dst_example_simple_lap_z_facto2_sched0_not_pqrcpend .................***Timeout 302.43 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress 
minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.804447e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.055894e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.897232e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 9.808996e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.798883e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.089076e-01 s Time to initialize coeftab 3.027638e-01 s Time to factorize 8.254994e+00 s ( 4.84 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 
355 Ko / 355 Ko ------------------------------------------------ Total 451 Ko / 451 Ko Test #2442: mpi_dst_example_simple_lap_z_facto2_sched0_kway_pqrcpbegin ..............***Timeout 302.43 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2443: mpi_dst_example_simple_lap_z_facto2_sched0_kway_pqrcpend ................***Timeout 302.42 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.735476e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: 
Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.529730e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.724895e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.489592e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.566937e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 7.380493e-01 s Time to initialize coeftab 2.211151e-01 s Test #2444: mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_pqrcpbegin ...***Timeout 302.42 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has 
been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2445: mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_pqrcpend .....***Timeout 302.41 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2446: mpi_dst_example_simple_lap_z_facto2_sched0_not_rqrcpbegin ...............***Timeout 302.41 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI 
communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.578039e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.795426e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.642691e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.129009e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.098096e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 8.888557e-01 s Time to initialize coeftab 1.290759e+00 s Test #2447: mpi_dst_example_simple_lap_z_facto2_sched0_not_rqrcpend .................***Timeout 302.40 sec ischedInit: The 
thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2448: mpi_dst_example_simple_lap_z_facto2_sched0_kway_rqrcpbegin ..............***Timeout 302.40 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2450: mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_rqrcpbegin ...***Timeout 302.39 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2451: mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_rqrcpend .....***Timeout 302.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2452: mpi_dst_example_simple_lap_z_facto2_sched0_not_tqrcpbegin ...............***Timeout 302.38 sec ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Test #2453: mpi_dst_example_simple_lap_z_facto2_sched0_not_tqrcpend .................***Timeout 302.37 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2454: mpi_dst_example_simple_lap_z_facto2_sched0_kway_tqrcpbegin ..............***Timeout 302.37 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2455: mpi_dst_example_simple_lap_z_facto2_sched0_kway_tqrcpend ................***Timeout 302.36 sec ischedInit: The thread number has been automatically set to 256 Test #2456: mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_tqrcpbegin ...***Timeout 302.36 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX 
: Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2457: mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_tqrcpend .....***Timeout 302.35 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy 
KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.619663e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.686090e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.083966e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.604774e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.125825e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 7.568853e-01 s Time to initialize coeftab 2.071419e-01 s Test #2458: mpi_dst_example_simple_lap_z_facto2_sched0_not_rqrrtbegin ...............***Timeout 302.35 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2459: mpi_dst_example_simple_lap_z_facto2_sched0_not_rqrrtend .................***Timeout 302.34 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 
Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2460: mpi_dst_example_simple_lap_z_facto2_sched0_kway_rqrrtbegin ..............***Timeout 302.34 sec Test #2462: mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_rqrrtbegin ...***Timeout 302.33 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2464: mpi_dst_example_simple_lap_z_facto2_sched0_kway_pqrcpilu0 ...............***Timeout 302.31 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2465: mpi_dst_example_simple_lap_z_facto2_sched0_kway_pqrcpilu1 ...............***Timeout 302.31 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread 
dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Test #2466: mpi_dst_example_simple_lap_z_facto3_sched0_not_svdbegin .................***Timeout 302.30 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 
1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.412585e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.032043e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.001333e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.622385e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.120597e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 8.943403e-01 s Time to initialize coeftab 5.310792e-01 s Test #2467: mpi_dst_example_simple_lap_z_facto3_sched0_not_svdend ...................***Timeout 302.29 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 
160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2468: mpi_dst_example_simple_lap_z_facto3_sched0_kway_svdbegin ................***Timeout 302.29 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2469: mpi_dst_example_simple_lap_z_facto3_sched0_kway_svdend ..................***Timeout 302.28 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Test #2470: 
mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_svdbegin .....***Timeout 302.27 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2472: mpi_dst_example_simple_lap_z_facto3_sched0_not_pqrcpbegin ...............***Timeout 302.26 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 
1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2473: mpi_dst_example_simple_lap_z_facto3_sched0_not_pqrcpend .................***Timeout 302.25 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2474: mpi_dst_example_simple_lap_z_facto3_sched0_kway_pqrcpbegin ..............***Timeout 302.24 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: 
Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2475: mpi_dst_example_simple_lap_z_facto3_sched0_kway_pqrcpend ................***Timeout 302.24 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2476: mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_pqrcpbegin ...***Timeout 302.23 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization 
method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2477: mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_pqrcpend .....***Timeout 302.23 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2478: mpi_dst_example_simple_lap_z_facto3_sched0_not_rqrcpbegin ...............***Timeout 302.21 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2479: mpi_dst_example_simple_lap_z_facto3_sched0_not_rqrcpend .................***Timeout 302.21 sec ischedInit: The thread number has been automatically set to 256 Test #2480: mpi_dst_example_simple_lap_z_facto3_sched0_kway_rqrcpbegin ..............***Timeout 302.20 sec ischedInit: The thread number has been automatically set to 256 Test #2481: mpi_dst_example_simple_lap_z_facto3_sched0_kway_rqrcpend ................***Timeout 302.20 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 
Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2482: mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_rqrcpbegin ...***Timeout 302.19 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2483: mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_rqrcpend .....***Timeout 302.18 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2484: mpi_dst_example_simple_lap_z_facto3_sched0_not_tqrcpbegin ...............***Timeout 302.18 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Test #2485: 
mpi_dst_example_simple_lap_z_facto3_sched0_not_tqrcpend .................***Timeout 302.17 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Test #2486: mpi_dst_example_simple_lap_z_facto3_sched0_kway_tqrcpbegin ..............***Timeout 302.17 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2487: mpi_dst_example_simple_lap_z_facto3_sched0_kway_tqrcpend ................***Timeout 302.16 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI 
communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Test #2488: mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_tqrcpbegin ...***Timeout 302.16 sec ischedInit: The thread number has been automatically set to 256 Test #2489: mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_tqrcpend .....***Timeout 302.15 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The 
thread number has been automatically set to 256 Test #2490: mpi_dst_example_simple_lap_z_facto3_sched0_not_rqrrtbegin ...............***Timeout 302.15 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Test #2491: mpi_dst_example_simple_lap_z_facto3_sched0_not_rqrrtend .................***Timeout 302.14 sec ischedInit: The thread number has been automatically set to 256 Test #2492: mpi_dst_example_simple_lap_z_facto3_sched0_kway_rqrrtbegin ..............***Timeout 302.13 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2493: mpi_dst_example_simple_lap_z_facto3_sched0_kway_rqrrtend ................***Timeout 302.13 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2494: mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_rqrrtbegin ...***Timeout 302.12 sec ischedInit: The thread number has 
been automatically set to 256 Test #2495: mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_rqrrtend .....***Timeout 302.11 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2496: mpi_dst_example_simple_lap_z_facto3_sched0_kway_pqrcpilu0 ...............***Timeout 302.11 sec Test #2497: mpi_dst_example_simple_lap_z_facto3_sched0_kway_pqrcpilu1 ...............***Timeout 302.10 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2498: mpi_dst_example_simple_lap_z_facto4_sched0_not_svdbegin .................***Timeout 302.10 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 
Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2500: mpi_dst_example_simple_lap_z_facto4_sched0_kway_svdbegin ................***Timeout 302.09 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2501: mpi_dst_example_simple_lap_z_facto4_sched0_kway_svdend ..................***Timeout 302.09 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion 
per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Test #2502: mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_svdbegin .....***Timeout 302.08 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2503: mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_svdend .......***Timeout 302.07 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 
256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2504: mpi_dst_example_simple_lap_z_facto4_sched0_not_pqrcpbegin ...............***Timeout 302.07 sec Test #2505: mpi_dst_example_simple_lap_z_facto4_sched0_not_pqrcpend .................***Timeout 302.06 sec ischedInit: The thread number has been automatically set to 256 Test #2506: mpi_dst_example_simple_lap_z_facto4_sched0_kway_pqrcpbegin ..............***Timeout 302.06 sec ischedInit: The thread number has been automatically set to 256 Test #2507: mpi_dst_example_simple_lap_z_facto4_sched0_kway_pqrcpend ................***Timeout 302.05 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress 
min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2508: mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_pqrcpbegin ...***Timeout 302.05 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 
1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.603302e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.085733e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.045300e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.629782e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.554894e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 7.376793e-01 s Time to initialize coeftab 1.451441e-01 s Time to factorize 1.243812e+01 s ( 1.71 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 5.657842e-02 s - iteration 1 : total iteration time 0.0923 s error 1.199e-14 Time for refinement 1.644115e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 
6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.199051e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.199051e-14 max(|| b_i - A x_i ||_1) 1.885659e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.758158e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.199051e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.199051e-14 max(|| b_i - A x_i ||_1) 1.885659e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.758158e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 1.885659e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.758158e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 1.885659e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.758158e-02 (SUCCESS) Test #2509: mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_pqrcpend .....***Timeout 302.04 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 
Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.156570e+00 s Test #2510: mpi_dst_example_simple_lap_z_facto4_sched0_not_rqrcpbegin ...............***Timeout 302.03 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2511: mpi_dst_example_simple_lap_z_facto4_sched0_not_rqrcpend .................***Timeout 302.03 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: 
PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.029739e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.715174e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.984639e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.151345e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.136180e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 1.156758e+00 s Time to initialize coeftab 1.060255e-01 s Test #2512: 
mpi_dst_example_simple_lap_z_facto4_sched0_kway_rqrcpbegin ..............***Timeout 302.02 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Test #2513: mpi_dst_example_simple_lap_z_facto4_sched0_kway_rqrcpend ................***Timeout 302.01 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 
GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2514: mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_rqrcpbegin ...***Timeout 302.01 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2515: mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_rqrcpend .....***Timeout 302.00 sec ischedInit: The thread number has been automatically set to 256 Test #2646: mpi_dst_example_simple_lap_d_facto0_sched1_kway_tqrcpend ................ 
Passed 291.37 sec Test #2516: mpi_dst_example_simple_lap_z_facto4_sched0_not_tqrcpbegin ...............***Timeout 301.96 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.546888e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.751515e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.134232e-01 s 
+-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.729260e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.145834e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 4.375112e-01 s Time to initialize coeftab 3.425579e-01 s Time to factorize 7.979273e+00 s ( 2.67 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 1.057832e-01 s - iteration 1 : total iteration time 0.102 s error 1.5512e-14 Time for refinement 2.605918e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.550833e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.550833e-14 max(|| b_i - A x_i ||_1) 2.328894e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.876590e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.550833e-14 max(|| b_i - A x_i ||_1) 2.328894e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.876590e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.550833e-14 max(|| b_i - A x_i ||_1) 2.328894e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 
5.876590e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 2.328894e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.876590e-02 (SUCCESS) Test #2517: mpi_dst_example_simple_lap_z_facto4_sched0_not_tqrcpend .................***Timeout 301.95 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.143257e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.324609e-03 s +-------------------------------------------------+ 
Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.490636e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.222522e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.380002e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 6.992354e-01 s Time to initialize coeftab 9.500676e-02 s Time to factorize 1.527556e+01 s ( 1.39 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 2.202779e-02 s - iteration 1 : total iteration time 0.0566 s error 3.419e-16 Time for refinement 1.122783e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.613349e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.613349e-16 max(|| b_i - A x_i ||_1) 8.485980e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.141301e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 8.485980e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.141301e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.613349e-16 max(|| b_i - A x_i ||_1) 8.485980e-17 max(|| b_i - A 
x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.141301e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.613349e-16 max(|| b_i - A x_i ||_1) 8.485980e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.141301e-03 (SUCCESS) Test #2519: mpi_dst_example_simple_lap_z_facto4_sched0_kway_tqrcpend ................***Timeout 303.06 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.192486e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to 
compute symbol matrix 1.058032e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.901614e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 4.844697e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.268439e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 1.960232e-01 s Time to initialize coeftab 2.003993e-01 s Time to factorize 3.641763e+00 s ( 5.85 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 6.364767e-02 s - iteration 1 : total iteration time 0.11 s error 3.419e-16 Time for refinement 2.154686e-01 s || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.613376e-16 max(|| b_i - A x_i ||_1) 8.486531e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.141439e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.613376e-16 max(|| b_i - A x_i ||_1) 8.486531e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.141439e-03 (SUCCESS) max(|| b_i - A x_i 
||_2 / || b_i ||_2) 3.613376e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.613376e-16 max(|| b_i - A x_i ||_1) 8.486531e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.141439e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 8.486531e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.141439e-03 (SUCCESS) Test #2520: mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_tqrcpbegin ...***Timeout 303.06 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.116480e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol 
factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.535977e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.282251e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.572801e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.374322e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 5.094103e-01 s Time to initialize coeftab 5.404836e-01 s Time to factorize 5.251120e+00 s ( 4.06 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 1.573461e-01 s - iteration 1 : total iteration time 0.109 s error 1.5512e-14 Time for refinement 2.164905e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.550838e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.550838e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.550838e-14 max(|| b_i - A x_i ||_1) 2.329037e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 
5.876950e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 2.329037e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.876950e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.550838e-14 max(|| b_i - A x_i ||_1) 2.329037e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.876950e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 2.329037e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.876950e-02 (SUCCESS) Test #2523: mpi_dst_example_simple_lap_z_facto4_sched0_not_rqrrtend .................***Timeout 305.25 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.345678e+00 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.558748e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.231936e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.805192e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.119696e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 5.423365e-01 s Time to initialize coeftab 5.156360e-01 s Time to factorize 3.644565e+00 s ( 5.85 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 6.599849e-02 s - iteration 1 : total iteration time 0.0262 s error 3.6647e-16 Time for refinement 7.939415e-02 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.922723e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.922723e-16 max(|| b_i - A x_i ||_1) 9.386057e-17 max(|| b_i - A x_i 
||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.368421e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.922723e-16 max(|| b_i - A x_i ||_1) 9.386057e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.368421e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 9.386057e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.368421e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.922723e-16 max(|| b_i - A x_i ||_1) 9.386057e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.368421e-03 (SUCCESS) Test #2525: mpi_dst_example_simple_lap_z_facto4_sched0_kway_rqrrtend ................***Timeout 306.28 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ 
Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.534101e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.318815e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.278677e-03 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.851164e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.301087e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 4.717983e-01 s Time to initialize coeftab 2.620388e-01 s Time to factorize 3.715629e+00 s ( 5.73 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 2.056950e-01 s - iteration 1 : total iteration time 0.344 s error 3.6647e-16 Time for refinement 1.188904e+00 s || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.920126e-16 max(|| b_i - A x_i 
||_1) 9.370724e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.364551e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.920126e-16 max(|| b_i - A x_i ||_1) 9.370724e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.364551e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.920126e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.920126e-16 max(|| b_i - A x_i ||_1) 9.370724e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.364551e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 9.370724e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.364551e-03 (SUCCESS) Test #2526: mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_rqrrtbegin ...***Timeout 306.26 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: 
N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.262060e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.588339e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.167172e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.996700e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.349600e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 8.063533e-02 s Time to initialize coeftab 3.457904e-01 s Time to factorize 2.923905e+00 s ( 7.29 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 5.770643e-02 s - iteration 1 : total iteration time 0.0416 s error 4.5455e-09 - iteration 2 : total iteration time 0.0497 s error 5.4821e-10 - iteration 3 : total iteration time 0.0707 s error 3.7667e-13 Time for refinement 1.948401e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 
max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.766590e-13 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.766590e-13 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.766590e-13 max(|| b_i - A x_i ||_1) 2.259507e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.701503e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.259507e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.701503e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.766590e-13 max(|| b_i - A x_i ||_1) 2.259507e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.701503e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.259507e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.701503e-01 (SUCCESS) Test #2527: mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_rqrrtend .....***Timeout 306.24 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically 
set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.530235e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.206047e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.942961e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.597163e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.768989e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 5.755889e-01 s Time to initialize coeftab 5.172127e-01 s Time to factorize 3.355301e+00 s ( 6.35 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 1.080898e-01 s - iteration 1 : total iteration time 0.0795 s error 3.6647e-16 Time for refinement 1.847435e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || 
A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.922723e-16 max(|| b_i - A x_i ||_1) 9.386057e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.368421e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.922723e-16 max(|| b_i - A x_i ||_1) 9.386057e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.368421e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.922723e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.922723e-16 max(|| b_i - A x_i ||_1) 9.386057e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.368421e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 9.386057e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.368421e-03 (SUCCESS) Test #2529: mpi_dst_example_simple_lap_z_facto4_sched0_kway_pqrcpilu1 ...............***Timeout 307.22 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 
Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.215748e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.882458e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.159398e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.829030e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.567681e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 6.839152e-01 s Time to initialize coeftab 2.871768e-01 s Time to factorize 6.814738e+00 s ( 3.13 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 9.888131e-02 s - iteration 1 : total iteration time 0.192 s error 6.1304e-15 Time for refinement 
2.664688e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.132569e-15 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.132569e-15 max(|| b_i - A x_i ||_1) 8.970997e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.263687e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.132569e-15 max(|| b_i - A x_i ||_1) 8.970997e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.263687e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.132569e-15 max(|| b_i - A x_i ||_1) 8.970997e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.263687e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 8.970997e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.263687e-02 (SUCCESS) Test #2531: mpi_dst_example_simple_lap_s_facto0_sched1_not_svdend ...................***Timeout 306.51 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS 
Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.605025e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.777609e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.177496e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 6.183626e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.897868e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.704123e-01 s Time to initialize coeftab 2.531755e-01 s Time to factorize 1.198495e+00 s ( 4.22 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 
6.622967e-01 s Time for refinement 3.634356e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.843540e-07 max(|| b_i - A x_i ||_1) 8.437654e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.060268e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.843540e-07 max(|| b_i - A x_i ||_1) 8.437654e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.060268e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.843540e-07 max(|| b_i - A x_i ||_1) 8.437654e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.060268e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.843540e-07 max(|| b_i - A x_i ||_1) 8.437654e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.060268e+00 (SUCCESS) Test #2533: mpi_dst_example_simple_lap_s_facto0_sched1_kway_svdend ..................***Timeout 306.68 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank 
parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.616116e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.962445e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.350439e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.821713e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.843043e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 8.470853e-02 s Time to initialize coeftab 5.165919e-01 s Time to factorize 8.792987e+00 s (589.54 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko 
Time to solve 9.088147e-01 s Time for refinement 1.339164e+00 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.903777e-07 max(|| b_i - A x_i ||_1) 8.576822e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.077755e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.903777e-07 max(|| b_i - A x_i ||_1) 8.576822e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.077755e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.903777e-07 max(|| b_i - A x_i ||_1) 8.576822e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.077755e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.903777e-07 max(|| b_i - A x_i ||_1) 8.576822e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.077755e+00 (SUCCESS) Test #2534: mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_svdbegin .....***Timeout 306.67 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress 
minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.878708e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.645091e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.120015e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.650713e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.124279e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 7.249837e-01 s Time to initialize coeftab 3.960558e-01 s Time to factorize 2.036287e+01 s (254.57 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko 
------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.153872e+00 s Time for refinement 2.554682e+00 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.139573e-07 max(|| b_i - A x_i ||_1) 9.689862e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.217619e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.139573e-07 max(|| b_i - A x_i ||_1) 9.689862e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.217619e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.139573e-07 max(|| b_i - A x_i ||_1) 9.689862e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.217619e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.139573e-07 max(|| b_i - A x_i ||_1) 9.689862e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.217619e+00 (SUCCESS) Test #2535: mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_svdend .......***Timeout 306.65 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 
Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.263729e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.519713e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.794998e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.150475e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.610137e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 5.024512e-01 s Time to initialize coeftab 2.197921e-01 s Time to factorize 5.121978e+00 s (1012.07 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside 
selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 9.252115e-01 s Time for refinement 3.971597e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.952150e-07 max(|| b_i - A x_i ||_1) 8.660193e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.088232e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.952150e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.952150e-07 max(|| b_i - A x_i ||_1) 8.660193e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.088232e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.660193e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.088232e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.952150e-07 max(|| b_i - A x_i ||_1) 8.660193e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.088232e+00 (SUCCESS) Test #2537: mpi_dst_example_simple_lap_s_facto0_sched1_not_pqrcpend .................***Timeout 307.60 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress 
method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.761300e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.899512e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.229906e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.070126e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.080670e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.301861e-01 s Time to initialize coeftab 9.245203e-01 s Time to factorize 1.143004e+00 s ( 4.43 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not 
selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 9.716747e-01 s Time for refinement 2.876787e+00 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.966139e-07 max(|| b_i - A x_i ||_1) 8.665548e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.088904e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.966139e-07 max(|| b_i - A x_i ||_1) 8.665548e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.088904e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.966139e-07 max(|| b_i - A x_i ||_1) 8.665548e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.088904e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.966139e-07 max(|| b_i - A x_i ||_1) 8.665548e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.088904e+00 (SUCCESS) Test #2539: mpi_dst_example_simple_lap_s_facto0_sched1_kway_pqrcpend ................***Timeout 308.55 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple 
Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.831401e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.292341e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.236048e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.042253e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.926401e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.904735e-01 s Time to initialize coeftab 6.927591e-02 s Time to factorize 7.730135e-01 s ( 6.55 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 
24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 2.964050e-01 s Time for refinement 1.509818e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.198093e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.198093e-07 max(|| b_i - A x_i ||_1) 1.288176e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.618709e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.198093e-07 max(|| b_i - A x_i ||_1) 1.288176e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.618709e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.288176e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.618709e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.198093e-07 max(|| b_i - A x_i ||_1) 1.288176e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.618709e+00 (SUCCESS) Test #2544: mpi_dst_example_simple_lap_s_facto0_sched1_kway_rqrcpbegin ..............***Timeout 312.44 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: 
Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.706909e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.795838e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.923673e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.605660e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.257516e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 5.901706e-01 s Time to initialize coeftab 4.310391e-01 s Time to factorize 5.671998e+00 s (913.93 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank 
supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44 Ko / 44.3 Ko ------------------------------------------------ Total 68.2 Ko / 68.5 Ko Time to solve 9.701539e-01 s - iteration 1 : total iteration time 4.49 s error 5.1747e-11 Time for refinement 7.045611e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.879727e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.879727e-08 max(|| b_i - A x_i ||_1) 2.897947e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.641532e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.879727e-08 max(|| b_i - A x_i ||_1) 2.897947e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.641532e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.897947e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.641532e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.879727e-08 max(|| b_i - A x_i ||_1) 2.897947e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.641532e-01 (SUCCESS) Test #2547: mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_rqrcpend .....***Timeout 314.35 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication 
support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.289262e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.248794e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.918000e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.066865e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.445179e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.166722e-01 s Time to initialize coeftab 2.289772e-01 s Time to factorize 3.766783e+00 s ( 1.34 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: 
------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.852855e+00 s Time for refinement 1.388694e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.706173e-07 max(|| b_i - A x_i ||_1) 1.157801e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.454882e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.706173e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.706173e-07 max(|| b_i - A x_i ||_1) 1.157801e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.454882e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.706173e-07 max(|| b_i - A x_i ||_1) 1.157801e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.454882e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.157801e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.454882e+00 (SUCCESS) Test #2548: mpi_dst_example_simple_lap_s_facto0_sched1_not_tqrcpbegin ...............***Timeout 314.33 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per 
process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.076484e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.222235e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.747749e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.169717e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.255657e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.859198e-01 s Time to initialize coeftab 5.500990e-01 s Time to factorize 8.187570e+00 s (633.13 KFlop/s) Number of operations 25.89 MFlops Number of 
static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44 Ko / 44.3 Ko ------------------------------------------------ Total 68.2 Ko / 68.5 Ko Time to solve 2.180424e+00 s - iteration 1 : total iteration time 3.89 s error 6.9352e-11 Time for refinement 7.423855e+00 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.069359e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.069359e-08 max(|| b_i - A x_i ||_1) 3.012385e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.785334e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.069359e-08 max(|| b_i - A x_i ||_1) 3.012385e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.785334e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.069359e-08 max(|| b_i - A x_i ||_1) 3.012385e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.785334e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 3.012385e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.785334e-01 (SUCCESS) Test #2551: mpi_dst_example_simple_lap_s_facto0_sched1_kway_tqrcpend ................***Timeout 314.33 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: 
sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.560756e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.765543e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.654978e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 4.131335e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.383884e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.306701e-01 s Time to initialize coeftab 1.259710e-01 s Time to 
factorize 1.971011e+00 s ( 2.57 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.181111e+00 s Time for refinement 5.146959e-01 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.152759e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.152759e-07 max(|| b_i - A x_i ||_1) 1.294602e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.626784e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.152759e-07 max(|| b_i - A x_i ||_1) 1.294602e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.626784e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.152759e-07 max(|| b_i - A x_i ||_1) 1.294602e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.626784e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.294602e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.626784e+00 (SUCCESS) Test #2553: mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_tqrcpend .....***Timeout 315.28 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ 
Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.355820e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.655193e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.860622e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.062770e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.377213e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.155973e-01 s Time to 
initialize coeftab 1.050078e-01 s Time to factorize 7.334060e-01 s ( 6.90 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 3.739739e-01 s Time for refinement 6.202573e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.732353e-07 max(|| b_i - A x_i ||_1) 1.186547e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.491003e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.732353e-07 max(|| b_i - A x_i ||_1) 1.186547e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.491003e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.732353e-07 max(|| b_i - A x_i ||_1) 1.186547e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.491003e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.732353e-07 max(|| b_i - A x_i ||_1) 1.186547e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.491003e+00 (SUCCESS) Test #2556: mpi_dst_example_simple_lap_s_facto0_sched1_kway_rqrrtbegin ..............***Timeout 315.65 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 
Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.466168e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.565898e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.421751e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.380278e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.857748e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 
2.367240e-01 s Time to initialize coeftab 4.176038e-01 s Time to factorize 8.902243e+00 s (582.30 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.167011e+00 s - iteration 1 : total iteration time 1.71 s error 5.6643e-11 Time for refinement 4.696326e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.069947e-08 max(|| b_i - A x_i ||_1) 2.978867e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.743216e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.069947e-08 max(|| b_i - A x_i ||_1) 2.978867e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.743216e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.069947e-08 max(|| b_i - A x_i ||_1) 2.978867e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.743216e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.069947e-08 max(|| b_i - A x_i ||_1) 2.978867e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.743216e-01 (SUCCESS) Test #2563: mpi_dst_example_simple_lap_s_facto1_sched1_not_svdend ...................***Timeout 319.84 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX 
package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.578115e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.692763e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.441610e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.999470e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.889059e+01 s 
+-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.133017e-01 s Time to initialize coeftab 2.234849e-01 s Time to factorize 9.053468e-01 s ( 5.78 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 8.249705e-01 s Time for refinement 2.961010e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.691757e-07 max(|| b_i - A x_i ||_1) 7.499420e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.423698e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.691757e-07 max(|| b_i - A x_i ||_1) 7.499420e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.423698e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.691757e-07 max(|| b_i - A x_i ||_1) 7.499420e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.423698e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.691757e-07 max(|| b_i - A x_i ||_1) 7.499420e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.423698e-01 (SUCCESS) Test #2613: mpi_dst_example_simple_lap_s_facto2_sched1_kway_tqrcpbegin ..............***Timeout 322.37 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread 
number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.570467e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.296585e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.393044e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.696863e+00 s +-------------------------------------------------+ Analyze task: Total time for 
analyze 1.229037e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.975641e-01 s Time to initialize coeftab 4.181637e-01 s Time to factorize 4.898559e+00 s ( 2.04 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.4 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 1.452016e+00 s - iteration 1 : total iteration time 1.77 s error 3.8478e-11 Time for refinement 3.827769e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.154543e-08 max(|| b_i - A x_i ||_1) 2.968651e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.730379e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.154543e-08 max(|| b_i - A x_i ||_1) 2.968651e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.730379e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.154543e-08 max(|| b_i - A x_i ||_1) 2.968651e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.730379e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.154543e-08 max(|| b_i - A x_i ||_1) 2.968651e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.730379e-01 (SUCCESS) Start 2613: mpi_dst_example_simple_lap_s_facto2_sched1_kway_tqrcpbegin Test #2618: mpi_dst_example_simple_lap_s_facto2_sched1_not_rqrrtend .................***Timeout 322.52 sec ischedInit: The thread number has been automatically set to 256 
ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.216513e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.048758e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.905674e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to 
factorize 1.548061e-03 s Time for mapping/scheduling 1.726104e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.423308e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 4.057854e-01 s Time to initialize coeftab 1.625199e-01 s Time to factorize 1.172667e+00 s ( 8.51 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 4.824007e-01 s - iteration 1 : total iteration time 3.85 s error 1.0406e-12 Time for refinement 6.119559e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.543701e-08 max(|| b_i - A x_i ||_1) 2.705287e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.399438e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.543701e-08 max(|| b_i - A x_i ||_1) 2.705287e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.399438e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.543701e-08 max(|| b_i - A x_i ||_1) 2.705287e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.399438e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.543701e-08 max(|| b_i - A x_i ||_1) 2.705287e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.399438e-01 (SUCCESS) Start 2618: mpi_dst_example_simple_lap_s_facto2_sched1_not_rqrrtend Test #2623: 
mpi_dst_example_simple_lap_s_facto2_sched1_kway_pqrcpilu0 ...............***Timeout 323.93 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.165470e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.258370e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.819260e-02 s +-------------------------------------------------+ Mapping/Scheduling 
subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.110365e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.398878e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.877426e-01 s Time to initialize coeftab 1.763209e-01 s Time to factorize 2.545246e+00 s ( 3.92 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 8.223987e-01 s - iteration 1 : total iteration time 2.17 s error 7.8765e-11 Time for refinement 7.165880e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.307538e-08 max(|| b_i - A x_i ||_1) 2.974596e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.737848e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.307538e-08 max(|| b_i - A x_i ||_1) 2.974596e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.737848e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.307538e-08 max(|| b_i - A x_i ||_1) 2.974596e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.737848e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.307538e-08 max(|| b_i - A x_i ||_1) 2.974596e-08 max(|| b_i - A 
x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.737848e-01 (SUCCESS) Start 2623: mpi_dst_example_simple_lap_s_facto2_sched1_kway_pqrcpilu0 Test #2624: mpi_dst_example_simple_lap_s_facto2_sched1_kway_pqrcpilu1 ...............***Timeout 323.92 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.991479e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.899934e-02 s +-------------------------------------------------+ Reordering 
subtask: Split level 0 Stoping criterion -1 Time for reordering 1.451111e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 9.389586e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.908842e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.330566e-01 s Time to initialize coeftab 5.531849e-02 s Time to factorize 1.338945e+00 s ( 7.46 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 5.303966e-01 s - iteration 1 : total iteration time 1.58 s error 7.9752e-11 Time for refinement 4.031224e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.803444e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.803444e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.803444e-08 max(|| b_i - A x_i ||_1) 2.814199e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.536295e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.803444e-08 max(|| b_i - A x_i ||_1) 2.814199e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.536295e-01 (SUCCESS) max(|| b_i - A x_i 
||_1) 2.814199e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.536295e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.814199e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.536295e-01 (SUCCESS) Start 2624: mpi_dst_example_simple_lap_s_facto2_sched1_kway_pqrcpilu1 Test #2632: mpi_dst_example_simple_lap_d_facto0_sched1_not_pqrcpend .................***Timeout 321.56 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.335936e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L 
structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.937110e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.869246e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 4.110485e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.879286e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 5.144834e-01 s Time to initialize coeftab 2.145802e-01 s Time to factorize 2.287108e+00 s ( 2.21 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.185655e+00 s - iteration 1 : total iteration time 7.09 s error 3.6705e-16 Time for refinement 1.497940e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.880711e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.880711e-16 max(|| b_i - A x_i ||_1) 8.116869e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.019954e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.880711e-16 max(|| b_i - A x_i ||_1) 8.116869e-17 
max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.019954e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.880711e-16 max(|| b_i - A x_i ||_1) 8.116869e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.019954e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 8.116869e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.019954e-03 (SUCCESS) Start 2632: mpi_dst_example_simple_lap_d_facto0_sched1_not_pqrcpend Test #2634: mpi_dst_example_simple_lap_d_facto0_sched1_kway_pqrcpend ................***Timeout 322.59 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.435271e+00 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.666190e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.833283e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.551859e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.924452e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 5.499419e-01 s Time to initialize coeftab 3.723447e-01 s Time to factorize 8.600922e-01 s ( 5.89 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.543503e+00 s - iteration 1 : total iteration time 0.806 s error 3.4579e-16 Time for refinement 2.089414e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.604870e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.604870e-16 max(|| b_i - A x_i ||_1) 7.538042e-17 max(|| b_i - A x_i 
||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.472198e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.604870e-16 max(|| b_i - A x_i ||_1) 7.538042e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.472198e-04 (SUCCESS) max(|| b_i - A x_i ||_1) 7.538042e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.472198e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.604870e-16 max(|| b_i - A x_i ||_1) 7.538042e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.472198e-04 (SUCCESS) Start 2634: mpi_dst_example_simple_lap_d_facto0_sched1_kway_pqrcpend Test #2638: mpi_dst_example_simple_lap_d_facto0_sched1_not_rqrcpend .................***Timeout 324.66 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 
760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.395604e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.615711e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.673881e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 6.309859e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.368055e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.532520e-01 s Time to initialize coeftab 5.819345e-01 s Time to factorize 1.380271e+00 s ( 3.67 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 4.177280e-01 s - iteration 1 : total iteration time 3.51 s error 5.2278e-15 Time for refinement 7.712908e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i 
- A x_i ||_2 / || b_i ||_2) 5.230545e-15 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.230545e-15 max(|| b_i - A x_i ||_1) 5.817966e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.310774e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.230545e-15 max(|| b_i - A x_i ||_1) 5.817966e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.310774e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.230545e-15 max(|| b_i - A x_i ||_1) 5.817966e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.310774e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 5.817966e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.310774e-03 (SUCCESS) Start 2638: mpi_dst_example_simple_lap_d_facto0_sched1_not_rqrcpend Test #2643: mpi_dst_example_simple_lap_d_facto0_sched1_not_tqrcpbegin ...............***Timeout 325.62 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 
Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.890635e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.119495e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.095294e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 9.934181e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.040118e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.189009e-01 s Time to initialize coeftab 1.539010e-01 s Time to factorize 2.214525e+00 s ( 2.29 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88 Ko / 88.6 Ko ------------------------------------------------ Total 136 Ko / 137 Ko Time to solve 1.679643e+00 s - iteration 1 : total iteration time 1.66 s error 3.9496e-14 Time for refinement 5.550027e+00 s || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 
max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.949187e-14 max(|| b_i - A x_i ||_1) 7.990315e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.004052e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.949187e-14 max(|| b_i - A x_i ||_1) 7.990315e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.004052e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.949187e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.949187e-14 max(|| b_i - A x_i ||_1) 7.990315e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.004052e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 7.990315e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.004052e-01 (SUCCESS) Start 2643: mpi_dst_example_simple_lap_d_facto0_sched1_not_tqrcpbegin Test #2644: mpi_dst_example_simple_lap_d_facto0_sched1_not_tqrcpend .................***Timeout 325.63 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 
Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.276310e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.000194e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.721558e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.988411e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.543709e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 9.163532e-02 s Time to initialize coeftab 2.259542e-01 s Time to factorize 1.725987e+00 s ( 2.93 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 6.440998e-01 s - iteration 1 : total iteration time 4.59 s error 8.3013e-15 Time for refinement 7.408919e+00 s || A ||_1 5.112481e-02 || A ||_1 
5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.308384e-15 max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.308384e-15 max(|| b_i - A x_i ||_1) 8.316148e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.044995e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.308384e-15 max(|| b_i - A x_i ||_1) 8.316148e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.044995e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 8.316148e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.044995e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.308384e-15 max(|| b_i - A x_i ||_1) 8.316148e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.044995e-02 (SUCCESS) Start 2644: mpi_dst_example_simple_lap_d_facto0_sched1_not_tqrcpend Test #2645: mpi_dst_example_simple_lap_d_facto0_sched1_kway_tqrcpbegin ..............***Timeout 325.63 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: 
Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.367973e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.858180e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.654270e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.496341e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.790828e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.703219e-01 s Time to initialize coeftab 5.076963e-01 s Time to factorize 1.578229e+01 s (328.46 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88 Ko / 88.6 Ko ------------------------------------------------ Total 136 Ko / 137 Ko Time to 
solve 1.156357e+00 s - iteration 1 : total iteration time 4.06 s error 6.553e-14 Time for refinement 1.063305e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.553050e-14 max(|| b_i - A x_i ||_1) 1.159125e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.456540e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.553050e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.553050e-14 max(|| b_i - A x_i ||_1) 1.159125e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.456540e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 1.159125e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.456540e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.553050e-14 max(|| b_i - A x_i ||_1) 1.159125e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.456540e-01 (SUCCESS) Start 2645: mpi_dst_example_simple_lap_d_facto0_sched1_kway_tqrcpbegin Test #2650: mpi_dst_example_simple_lap_d_facto0_sched1_not_rqrrtend .................***Timeout 327.70 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: 
AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.149269e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.124398e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.562072e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.403252e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.497616e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 5.716651e-01 s Time to initialize coeftab 7.607310e-01 s Time to factorize 1.036372e+00 s ( 4.88 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not 
selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 5.961702e-01 s - iteration 1 : total iteration time 1.47 s error 7.472e-14 Time for refinement 5.043505e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.472152e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.472152e-14 max(|| b_i - A x_i ||_1) 4.488170e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.639772e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.472152e-14 max(|| b_i - A x_i ||_1) 4.488170e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.639772e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.472152e-14 max(|| b_i - A x_i ||_1) 4.488170e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.639772e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 4.488170e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.639772e-02 (SUCCESS) Start 2650: mpi_dst_example_simple_lap_d_facto0_sched1_not_rqrrtend Test #2658: mpi_dst_example_simple_lap_d_facto1_sched1_not_svdend ...................***Timeout 328.58 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of 
GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.374585e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.331611e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.058378e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.352842e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.556641e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.562992e-01 s Time to initialize coeftab 2.368347e-01 s Time to factorize 2.333745e+00 s ( 2.24 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 
Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 3.191344e+00 s - iteration 1 : total iteration time 4.24 s error 3.4639e-16 Time for refinement 7.378940e+00 s || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.681889e-16 max(|| b_i - A x_i ||_1) 8.885916e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.116592e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.681889e-16 max(|| b_i - A x_i ||_1) 8.885916e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.116592e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.681889e-16 max(|| b_i - A x_i ||_1) 8.885916e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.116592e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.681889e-16 max(|| b_i - A x_i ||_1) 8.885916e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.116592e-03 (SUCCESS) Start 2658: mpi_dst_example_simple_lap_d_facto1_sched1_not_svdend Test #2663: mpi_dst_example_simple_lap_d_facto1_sched1_not_pqrcpbegin ...............***Timeout 331.59 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads 
per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.218842e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.190213e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.038957e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.105265e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.424378e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 5.634393e-01 s 
Time to initialize coeftab 2.910743e-01 s Time to factorize 1.583352e+00 s ( 3.31 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.632832e+00 s - iteration 1 : total iteration time 3.96 s error 1.7778e-14 Time for refinement 5.654171e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.778255e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.778255e-14 max(|| b_i - A x_i ||_1) 3.133020e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.936909e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 3.133020e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.936909e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.778255e-14 max(|| b_i - A x_i ||_1) 3.133020e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.936909e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.778255e-14 max(|| b_i - A x_i ||_1) 3.133020e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.936909e-02 (SUCCESS) Start 2663: mpi_dst_example_simple_lap_d_facto1_sched1_not_pqrcpbegin Test #2674: mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_rqrcpend .....***Timeout 334.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.129449e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.866479e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.467964e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.393270e+00 s +-------------------------------------------------+ Analyze task: Total time 
for analyze 1.289583e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.918556e-01 s Time to initialize coeftab 3.422402e-01 s Time to factorize 1.754281e+00 s ( 2.98 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.825024e+00 s - iteration 1 : total iteration time 2.48 s error 7.4499e-15 Time for refinement 6.744117e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.452260e-15 max(|| b_i - A x_i ||_1) 7.253212e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.114284e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.452260e-15 max(|| b_i - A x_i ||_1) 7.253212e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.114284e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.452260e-15 max(|| b_i - A x_i ||_1) 7.253212e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.114284e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.452260e-15 max(|| b_i - A x_i ||_1) 7.253212e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.114284e-03 (SUCCESS) Start 2674: mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_rqrcpend Test #2685: mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_rqrrtbegin ...***Timeout 336.15 sec ischedInit: The thread number has been 
automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.312127e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.693297e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.383173e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops 
Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.215649e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.506735e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.385641e-01 s Time to initialize coeftab 3.090849e-01 s Time to factorize 2.005415e+00 s ( 2.61 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 6.773370e-01 s - iteration 1 : total iteration time 2.71 s error 1.6307e-12 - iteration 2 : total iteration time 5.22 s error 4.7834e-18 Time for refinement 1.340317e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.296702e-16 max(|| b_i - A x_i ||_1) 6.399525e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.041554e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.296702e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.296702e-16 max(|| b_i - A x_i ||_1) 6.399525e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.041554e-04 (SUCCESS) max(|| b_i - A x_i ||_1) 6.399525e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.041554e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.296702e-16 max(|| b_i - A x_i ||_1) 6.399525e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.041554e-04 
(SUCCESS) Start 2685: mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_rqrrtbegin Test #2687: mpi_dst_example_simple_lap_d_facto1_sched1_kway_pqrcpilu0 ...............***Timeout 337.05 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.236526e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.018247e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 
Time for reordering 2.052914e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.359796e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.522293e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.247865e-01 s Time to initialize coeftab 2.102267e-01 s Time to factorize 1.910918e+00 s ( 2.74 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 4.184094e-01 s - iteration 1 : total iteration time 2.44 s error 7.9703e-15 Time for refinement 5.875586e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.966943e-15 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.966943e-15 max(|| b_i - A x_i ||_1) 1.286374e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.616439e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.966943e-15 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.966943e-15 max(|| b_i - A x_i ||_1) 1.286374e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.616439e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 1.286374e-15 max(|| b_i - A x_i 
||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.616439e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 1.286374e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.616439e-02 (SUCCESS) Start 2687: mpi_dst_example_simple_lap_d_facto1_sched1_kway_pqrcpilu0 Test #2688: mpi_dst_example_simple_lap_d_facto1_sched1_kway_pqrcpilu1 ...............***Timeout 337.05 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.189372e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 
Time to compute symbol matrix 5.023509e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.512166e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.649033e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.415788e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 5.846388e-01 s Time to initialize coeftab 3.594558e-01 s Time to factorize 2.154815e+00 s ( 2.43 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 6.865092e-01 s - iteration 1 : total iteration time 3.52 s error 1.1406e-14 Time for refinement 6.586051e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.141254e-14 max(|| b_i - A x_i ||_1) 1.689766e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.123337e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.141254e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.141254e-14 max(|| b_i - A x_i ||_1) 1.689766e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * 
||x_i||_oo * eps)) 2.123337e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.141254e-14 max(|| b_i - A x_i ||_1) 1.689766e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.123337e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 1.689766e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.123337e-02 (SUCCESS) Start 2688: mpi_dst_example_simple_lap_d_facto1_sched1_kway_pqrcpilu1 Test #2691: mpi_dst_example_simple_lap_d_facto2_sched1_kway_svdbegin ................***Timeout 337.94 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.626193e+01 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.828364e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.336095e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.170436e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.825535e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.781756e-01 s Time to initialize coeftab 4.702314e-01 s Time to factorize 3.658552e+00 s ( 2.73 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 1.871946e+00 s - iteration 1 : total iteration time 1.87 s error 2.4693e-14 Time for refinement 4.467045e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.468696e-14 max(|| b_i - A x_i ||_1) 4.352621e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.469443e-02 (SUCCESS) 
max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.468696e-14 max(|| b_i - A x_i ||_1) 4.352621e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.469443e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.468696e-14 max(|| b_i - A x_i ||_1) 4.352621e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.469443e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.468696e-14 max(|| b_i - A x_i ||_1) 4.352621e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.469443e-02 (SUCCESS) Start 2691: mpi_dst_example_simple_lap_d_facto2_sched1_kway_svdbegin Test #2695: mpi_dst_example_simple_lap_d_facto2_sched1_not_pqrcpbegin ...............***Timeout 337.91 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 
200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.067428e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.267737e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.462282e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.821367e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.382076e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 5.229526e-01 s Time to initialize coeftab 3.075727e-01 s Time to factorize 2.721799e+00 s ( 3.67 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 5.415565e-01 s - iteration 1 : total iteration time 1.91 s error 1.8834e-14 Time for refinement 5.632601e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 
/ || b_i ||_2) 1.883947e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.883947e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.883947e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.883947e-14 max(|| b_i - A x_i ||_1) 3.306832e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.155319e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 3.306832e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.155319e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 3.306832e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.155319e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 3.306832e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.155319e-02 (SUCCESS) Start 2695: mpi_dst_example_simple_lap_d_facto2_sched1_not_pqrcpbegin Test #2579: mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_rqrcpend .....***Timeout 341.47 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has 
been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.428172e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.291983e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.475480e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.071346e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.777724e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.585704e-01 s Time to initialize coeftab 5.941466e-01 s Time to factorize 1.992258e+00 s ( 2.63 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 2.397026e+00 s Time for refinement 2.539488e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i 
||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.634554e-07 max(|| b_i - A x_i ||_1) 1.061597e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.333992e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.634554e-07 max(|| b_i - A x_i ||_1) 1.061597e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.333992e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.634554e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.634554e-07 max(|| b_i - A x_i ||_1) 1.061597e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.333992e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.061597e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.333992e+00 (SUCCESS) Test #2594: mpi_dst_example_simple_lap_s_facto2_sched1_not_svdend ...................***Timeout 345.82 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: 
The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.618148e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.880296e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.380672e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.745230e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.995904e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 7.239381e-01 s Time to initialize coeftab 3.232725e-01 s Time to factorize 3.740577e+00 s ( 2.67 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 2.204193e+00 s Time for refinement 3.937221e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 
max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.714123e-07 max(|| b_i - A x_i ||_1) 7.436639e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.344809e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.714123e-07 max(|| b_i - A x_i ||_1) 7.436639e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.344809e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.714123e-07 max(|| b_i - A x_i ||_1) 7.436639e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.344809e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.714123e-07 max(|| b_i - A x_i ||_1) 7.436639e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.344809e-01 (SUCCESS) Test #2598: mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_svdend .......***Timeout 345.78 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically 
set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.950027e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.560043e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.067860e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.029373e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.182965e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 5.143883e-02 s Time to initialize coeftab 3.975236e-01 s Time to factorize 1.781700e+00 s ( 5.60 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 1.246916e+00 s Time for refinement 1.158277e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 
max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.718524e-07 max(|| b_i - A x_i ||_1) 7.486536e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.407510e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.718524e-07 max(|| b_i - A x_i ||_1) 7.486536e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.407510e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.718524e-07 max(|| b_i - A x_i ||_1) 7.486536e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.407510e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.718524e-07 max(|| b_i - A x_i ||_1) 7.486536e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.407510e-01 (SUCCESS) Test #2603: mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_pqrcpbegin ...***Timeout 348.37 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread 
number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.546496e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.414595e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.212742e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 8.590503e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.717733e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.654906e-01 s Time to initialize coeftab 3.982129e-01 s Time to factorize 2.730597e+00 s ( 3.66 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 3.033200e+00 s - iteration 1 : total iteration time 4.06 s error 9.0571e-11 Time for refinement 7.015617e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 
max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.251342e-08 max(|| b_i - A x_i ||_1) 2.986506e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.752815e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.251342e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.251342e-08 max(|| b_i - A x_i ||_1) 2.986506e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.752815e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.251342e-08 max(|| b_i - A x_i ||_1) 2.986506e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.752815e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.986506e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.752815e-01 (SUCCESS) Test #2605: mpi_dst_example_simple_lap_s_facto2_sched1_not_rqrcpbegin ...............***Timeout 349.23 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per 
block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.420626e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.790626e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.227321e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.178851e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.627047e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.076848e-01 s Time to initialize coeftab 1.366144e+00 s Time to factorize 2.219644e+00 s ( 4.50 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.4 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 7.979672e-01 s - iteration 1 : total iteration time 2.48 s error 9.004e-11 Time for refinement 5.851650e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 
max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.113444e-08 max(|| b_i - A x_i ||_1) 2.958977e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.718222e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.113444e-08 max(|| b_i - A x_i ||_1) 2.958977e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.718222e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.113444e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.113444e-08 max(|| b_i - A x_i ||_1) 2.958977e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.718222e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.958977e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.718222e-01 (SUCCESS) 2546/3626 Test #2715: mpi_dst_example_simple_lap_d_facto2_sched1_kway_rqrrtbegin ..............***Timeout 352.72 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method 
CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.259370e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.348926e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.734506e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 9.074208e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.432505e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 7.470357e-02 s Time to initialize coeftab 3.751125e-01 s Time to factorize 3.745240e+00 s ( 2.67 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 7.003438e-01 s - iteration 1 : total iteration time 1.33 s error 1.127e-12 - 
iteration 2 : total iteration time 1.95 s error 3.6788e-18 Time for refinement 7.782173e+00 s || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.142734e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.142734e-16 max(|| b_i - A x_i ||_1) 5.896840e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.409887e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.142734e-16 max(|| b_i - A x_i ||_1) 5.896840e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.409887e-04 (SUCCESS) max(|| b_i - A x_i ||_1) 5.896840e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.409887e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.142734e-16 max(|| b_i - A x_i ||_1) 5.896840e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.409887e-04 (SUCCESS) Start 2715: mpi_dst_example_simple_lap_d_facto2_sched1_kway_rqrrtbegin 2546/3626 Test #2751: mpi_dst_example_simple_lap_c_facto0_sched1_kway_pqrcpilu0 ...............***Timeout 378.64 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank 
parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.637752e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.894618e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.286185e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.332536e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.749429e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.118368e-01 s Time to initialize coeftab 9.709398e-02 s Time to factorize 1.940420e+01 s ( 1.05 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not 
selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 5.895542e-01 s Time for refinement 4.389079e+00 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.369788e-07 max(|| b_i - A x_i ||_1) 1.552073e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.916420e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.369788e-07 max(|| b_i - A x_i ||_1) 1.552073e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.916420e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.369788e-07 max(|| b_i - A x_i ||_1) 1.552073e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.916420e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.369788e-07 max(|| b_i - A x_i ||_1) 1.552073e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.916420e+00 (SUCCESS) Start 2751: mpi_dst_example_simple_lap_c_facto0_sched1_kway_pqrcpilu0 2546/3626 Test #2782: mpi_dst_example_simple_lap_c_facto1_sched1_kwayprojections_rqrrtend .....***Timeout 398.89 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: 
PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.108844e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.412213e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.380107e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.231485e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.383322e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.041502e-01 s Time to initialize coeftab 9.463913e-02 s Time to factorize 6.431874e+00 s ( 3.31 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: 
------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 8.943189e-01 s Time for refinement 2.446710e+00 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.011343e-07 max(|| b_i - A x_i ||_1) 1.295686e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.269467e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.011343e-07 max(|| b_i - A x_i ||_1) 1.295686e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.269467e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.011343e-07 max(|| b_i - A x_i ||_1) 1.295686e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.269467e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.011343e-07 max(|| b_i - A x_i ||_1) 1.295686e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.269467e+00 (SUCCESS) Start 2782: mpi_dst_example_simple_lap_c_facto1_sched1_kwayprojections_rqrrtend 2546/3626 Test #2783: mpi_dst_example_simple_lap_c_facto1_sched1_kway_pqrcpilu0 ...............***Timeout 398.89 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled 
PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.115244e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.019087e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.268082e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.157147e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.329527e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.136915e-01 s Time to initialize coeftab 1.853300e-01 s Time to 
factorize 4.920855e+00 s ( 4.33 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 9.297025e-01 s - iteration 1 : total iteration time 4.84 s error 1.6416e-11 Time for refinement 7.188523e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.344876e-08 max(|| b_i - A x_i ||_1) 3.198985e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.072151e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.344876e-08 max(|| b_i - A x_i ||_1) 3.198985e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.072151e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.344876e-08 max(|| b_i - A x_i ||_1) 3.198985e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.072151e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.344876e-08 max(|| b_i - A x_i ||_1) 3.198985e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.072151e-01 (SUCCESS) Start 2783: mpi_dst_example_simple_lap_c_facto1_sched1_kway_pqrcpilu0 2546/3626 Test #2788: mpi_dst_example_simple_lap_c_facto2_sched1_kway_svdend ..................***Timeout 401.14 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.414566e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.710914e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.263909e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 8.977627e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.572647e+01 s 
+-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.983882e-01 s Time to initialize coeftab 2.114246e+00 s Time to factorize 1.089190e+01 s ( 3.67 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 2.797101e+00 s Time for refinement 4.990744e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.739004e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.739004e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.739004e-07 max(|| b_i - A x_i ||_1) 7.579402e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.912547e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 7.579402e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.912547e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.739004e-07 max(|| b_i - A x_i ||_1) 7.579402e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.912547e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 7.579402e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.912547e+00 (SUCCESS) Start 2788: mpi_dst_example_simple_lap_c_facto2_sched1_kway_svdend 2546/3626 Test #2790: mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_svdend .......***Timeout 401.78 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.120429e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.393661e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.537351e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for 
mapping/scheduling 2.568893e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.444402e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.747899e-01 s Time to initialize coeftab 6.130315e-01 s Time to factorize 7.487830e+00 s ( 5.34 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 1.687736e+00 s Time for refinement 1.455386e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.796778e-07 max(|| b_i - A x_i ||_1) 7.762101e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.958648e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.796778e-07 max(|| b_i - A x_i ||_1) 7.762101e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.958648e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.796778e-07 max(|| b_i - A x_i ||_1) 7.762101e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.958648e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.796778e-07 max(|| b_i - A x_i ||_1) 7.762101e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.958648e+00 (SUCCESS) Start 2790: mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_svdend 2546/3626 Test #2820: mpi_dst_example_simple_lap_c_facto3_sched1_kway_svdend ..................***Timeout 416.42 
sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.645386e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.872743e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.327810e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in 
full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 9.052431e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.772076e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 3.251652e-01 s Time to initialize coeftab 8.413438e-02 s Time to factorize 9.552762e+00 s ( 2.12 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 5.462074e-01 s Time for refinement 2.087068e+00 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.933571e-07 max(|| b_i - A x_i ||_1) 8.695780e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.194248e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.933571e-07 max(|| b_i - A x_i ||_1) 8.695780e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.194248e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.933571e-07 max(|| b_i - A x_i ||_1) 8.695780e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.194248e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.933571e-07 max(|| b_i - A x_i ||_1) 8.695780e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.194248e+00 (SUCCESS) Start 2820: mpi_dst_example_simple_lap_c_facto3_sched1_kway_svdend 2546/3626 Test 
#2822: mpi_dst_example_simple_lap_c_facto3_sched1_kwayprojections_svdend .......***Timeout 416.53 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.562556e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.825406e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.168969e-01 s 
+-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.559051e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.178512e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 2.275316e-01 s Time to initialize coeftab 6.730384e-01 s Time to factorize 5.026693e+00 s ( 4.03 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.659058e+00 s Time for refinement 1.511598e+00 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.925694e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.925694e-07 max(|| b_i - A x_i ||_1) 8.651022e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.182954e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.651022e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.182954e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.925694e-07 max(|| b_i - A x_i ||_1) 8.651022e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.182954e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.925694e-07 max(|| b_i - A x_i ||_1) 8.651022e-08 
max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.182954e+00 (SUCCESS) Start 2822: mpi_dst_example_simple_lap_c_facto3_sched1_kwayprojections_svdend 2546/3626 Test #2824: mpi_dst_example_simple_lap_c_facto3_sched1_not_pqrcpend .................***Timeout 417.05 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.808114e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.767917e-03 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.266124e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.068540e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.063935e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 5.749780e-01 s Time to initialize coeftab 1.379976e-01 s Time to factorize 2.154723e+00 s ( 9.41 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 9.940297e-01 s Time for refinement 1.476091e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.790354e-07 max(|| b_i - A x_i ||_1) 9.792068e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.470879e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.790354e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.790354e-07 max(|| b_i - A x_i ||_1) 9.792068e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.470879e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.790354e-07 max(|| b_i - 
A x_i ||_1) 9.792068e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.470879e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 9.792068e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.470879e+00 (SUCCESS) Start 2824: mpi_dst_example_simple_lap_c_facto3_sched1_not_pqrcpend 2546/3626 Test #2825: mpi_dst_example_simple_lap_c_facto3_sched1_kway_pqrcpbegin ..............***Timeout 417.05 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.576356e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of 
nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.696699e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.513196e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.409596e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.760056e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 9.640272e-02 s Time to initialize coeftab 3.258536e-01 s Time to factorize 1.852051e+01 s ( 1.10 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 8.623969e-01 s - iteration 1 : total iteration time 4.42 s error 4.755e-11 Time for refinement 6.912548e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.388149e-08 max(|| b_i - A x_i ||_1) 3.232536e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.156815e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.388149e-08 max(|| b_i - A x_i ||_1) 3.232536e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * 
||x_i||_oo * eps)) 8.156815e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.388149e-08 max(|| b_i - A x_i ||_1) 3.232536e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.156815e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.388149e-08 max(|| b_i - A x_i ||_1) 3.232536e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.156815e-01 (SUCCESS) Start 2825: mpi_dst_example_simple_lap_c_facto3_sched1_kway_pqrcpbegin 2546/3626 Test #2840: mpi_dst_example_simple_lap_c_facto3_sched1_kwayprojections_tqrcpend .....***Timeout 421.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: 
Scotch Time to compute ordering 1.194852e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.481405e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.056182e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.149214e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.589290e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 4.457455e-01 s Time to initialize coeftab 8.748642e-01 s Time to factorize 1.005250e+01 s ( 2.02 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 9.344982e-01 s Time for refinement 2.114328e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.195462e-07 max(|| b_i - A x_i ||_1) 1.199572e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.026938e+00 (SUCCESS) max(|| 
b_i - A x_i ||_2 / || b_i ||_2) 7.195462e-07 max(|| b_i - A x_i ||_1) 1.199572e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.026938e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.195462e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.195462e-07 max(|| b_i - A x_i ||_1) 1.199572e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.026938e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.199572e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.026938e+00 (SUCCESS) Start 2840: mpi_dst_example_simple_lap_c_facto3_sched1_kwayprojections_tqrcpend 2546/3626 Test #2852: mpi_dst_example_simple_lap_c_facto4_sched1_kway_svdend ..................***Timeout 425.14 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 
200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.302418e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.310785e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.410781e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.128900e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.594546e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 5.182348e-01 s Time to initialize coeftab 2.567842e-01 s Time to factorize 2.725921e+00 s ( 7.82 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 2.091008e+00 s Time for refinement 3.238839e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.825130e-07 max(|| b_i - 
A x_i ||_1) 7.987474e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.015517e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.825130e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.825130e-07 max(|| b_i - A x_i ||_1) 7.987474e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.015517e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.825130e-07 max(|| b_i - A x_i ||_1) 7.987474e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.015517e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 7.987474e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.015517e+00 (SUCCESS) Start 2852: mpi_dst_example_simple_lap_c_facto4_sched1_kway_svdend 2546/3626 Test #2857: mpi_dst_example_simple_lap_c_facto4_sched1_kway_pqrcpbegin ..............***Timeout 426.33 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: 
Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.155065e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.454610e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.664913e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.671497e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.372492e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 4.246149e-01 s Time to initialize coeftab 2.552997e-01 s Time to factorize 5.191242e+00 s ( 4.10 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 7.369176e-01 s - iteration 1 : total iteration time 2.79 s error 1.6952e-11 Time for refinement 8.980216e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i 
||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.338405e-08 max(|| b_i - A x_i ||_1) 3.195903e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.064376e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.338405e-08 max(|| b_i - A x_i ||_1) 3.195903e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.064376e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.338405e-08 max(|| b_i - A x_i ||_1) 3.195903e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.064376e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.338405e-08 max(|| b_i - A x_i ||_1) 3.195903e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.064376e-01 (SUCCESS) Start 2857: mpi_dst_example_simple_lap_c_facto4_sched1_kway_pqrcpbegin 2546/3626 Test #2859: mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_pqrcpbegin ...***Timeout 426.70 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute 
Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.227174e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.521165e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.966204e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.238597e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.490023e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 4.554040e-01 s Time to initialize coeftab 2.979280e-01 s Time to factorize 4.669930e+00 s ( 4.56 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.974274e+00 s - iteration 1 : total iteration time 2.4 s error 5.3212e-11 Time for refinement 4.376745e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 
2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.477419e-08 max(|| b_i - A x_i ||_1) 3.194506e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.060849e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.477419e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.477419e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.477419e-08 max(|| b_i - A x_i ||_1) 3.194506e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.060849e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 3.194506e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.060849e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 3.194506e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.060849e-01 (SUCCESS) Start 2859: mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_pqrcpbegin Test #2573: mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_pqrcpend .....***Timeout 432.66 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank 
parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.674476e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.666803e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.909269e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 8.142655e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.842645e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.578852e-01 s Time to initialize coeftab 1.718306e+00 s Time to factorize 1.031515e+00 s ( 5.07 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 
68.5 Ko / 68.5 Ko Time to solve 6.770077e-01 s Time for refinement 9.250099e-01 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.634545e-07 max(|| b_i - A x_i ||_1) 7.302314e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.176016e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.634545e-07 max(|| b_i - A x_i ||_1) 7.302314e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.176016e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.634545e-07 max(|| b_i - A x_i ||_1) 7.302314e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.176016e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.634545e-07 max(|| b_i - A x_i ||_1) 7.302314e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.176016e-01 (SUCCESS) Test #2576: mpi_dst_example_simple_lap_s_facto1_sched1_kway_rqrcpbegin ..............***Timeout 432.67 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 
Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.106210e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.078585e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.316059e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.630057e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.326436e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.040418e-01 s Time to initialize coeftab 2.501676e-01 s Time to factorize 3.079496e+00 s ( 1.70 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44 Ko / 44.3 Ko 
------------------------------------------------ Total 68.2 Ko / 68.5 Ko Time to solve 3.807013e+00 s - iteration 1 : total iteration time 4.31 s error 4.7136e-11 Time for refinement 7.660855e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.952382e-08 max(|| b_i - A x_i ||_1) 2.921333e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.670918e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.952382e-08 max(|| b_i - A x_i ||_1) 2.921333e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.670918e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.952382e-08 max(|| b_i - A x_i ||_1) 2.921333e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.670918e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.952382e-08 max(|| b_i - A x_i ||_1) 2.921333e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.670918e-01 (SUCCESS) Test #2616: mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_tqrcpend .....***Timeout 432.68 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational 
models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.374073e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.513269e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.685264e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 8.117140e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.544496e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 4.656249e-02 s Time to initialize coeftab 2.157070e-01 s Time to factorize 2.770871e+00 s ( 3.60 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 
24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 7.410272e-01 s - iteration 1 : total iteration time 5.1 s error 6.7804e-13 Time for refinement 1.226980e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.422028e-08 max(|| b_i - A x_i ||_1) 2.681928e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.370085e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.422028e-08 max(|| b_i - A x_i ||_1) 2.681928e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.370085e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.422028e-08 max(|| b_i - A x_i ||_1) 2.681928e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.370085e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.422028e-08 max(|| b_i - A x_i ||_1) 2.681928e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.370085e-01 (SUCCESS) Start 2616: mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_tqrcpend Test #2631: mpi_dst_example_simple_lap_d_facto0_sched1_not_pqrcpbegin ...............***Timeout 432.77 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: 
Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.203699e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.208451e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.403045e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.952170e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.440035e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.780761e-01 s Time to initialize coeftab 6.276531e-01 s Time to factorize 6.443362e+00 s (804.52 KFlop/s) Number of operations 25.89 MFlops 
Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 6.216893e-01 s - iteration 1 : total iteration time 2.31 s error 1.8133e-14 Time for refinement 7.034165e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.813666e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.813666e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.813666e-14 max(|| b_i - A x_i ||_1) 3.265056e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.102824e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 3.265056e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.102824e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 3.265056e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.102824e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.813666e-14 max(|| b_i - A x_i ||_1) 3.265056e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.102824e-02 (SUCCESS) Start 2631: mpi_dst_example_simple_lap_d_facto0_sched1_not_pqrcpbegin Test #2642: mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_rqrcpend .....***Timeout 432.80 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.388656e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.772922e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.425756e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.319842e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.790217e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to 
initialize internal csc 2.163588e-01 s Time to initialize coeftab 2.603190e-01 s Time to factorize 3.269094e+00 s ( 1.55 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 7.065980e-01 s - iteration 1 : total iteration time 2.12 s error 3.3332e-15 Time for refinement 5.925597e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.336356e-15 max(|| b_i - A x_i ||_1) 3.467739e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.357512e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.336356e-15 max(|| b_i - A x_i ||_1) 3.467739e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.357512e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.336356e-15 max(|| b_i - A x_i ||_1) 3.467739e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.357512e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.336356e-15 max(|| b_i - A x_i ||_1) 3.467739e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.357512e-03 (SUCCESS) Start 2642: mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_rqrcpend Test #2669: mpi_dst_example_simple_lap_d_facto1_sched1_not_rqrcpbegin ...............***Timeout 432.99 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.360579e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.478444e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.083803e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 8.894184e-01 s 
+-------------------------------------------------+ Analyze task: Total time for analyze 1.500895e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.722064e-01 s Time to initialize coeftab 3.549031e-01 s Time to factorize 4.671784e+00 s ( 1.12 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88 Ko / 88.6 Ko ------------------------------------------------ Total 136 Ko / 137 Ko Time to solve 1.153064e+00 s - iteration 1 : total iteration time 3.58 s error 3.6996e-14 Time for refinement 6.487128e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.699876e-14 max(|| b_i - A x_i ||_1) 6.933681e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.712766e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.699876e-14 max(|| b_i - A x_i ||_1) 6.933681e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.712766e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.699876e-14 max(|| b_i - A x_i ||_1) 6.933681e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.712766e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.699876e-14 max(|| b_i - A x_i ||_1) 6.933681e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.712766e-02 (SUCCESS) Start 2669: mpi_dst_example_simple_lap_d_facto1_sched1_not_rqrcpbegin Test #2683: mpi_dst_example_simple_lap_d_facto1_sched1_kway_rqrrtbegin 
..............***Timeout 433.08 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.222604e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.690575e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.171567e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 
17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 9.150774e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.358590e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.104945e-01 s Time to initialize coeftab 3.591601e-01 s Time to factorize 1.422920e+00 s ( 3.68 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 2.508933e+00 s - iteration 1 : total iteration time 1.83 s error 3.5345e-13 Time for refinement 5.373773e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.534523e-13 max(|| b_i - A x_i ||_1) 7.028430e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.831827e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.534523e-13 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.534523e-13 max(|| b_i - A x_i ||_1) 7.028430e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.831827e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 7.028430e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.831827e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.534523e-13 max(|| b_i - A x_i ||_1) 7.028430e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 
8.831827e-01 (SUCCESS) Start 2683: mpi_dst_example_simple_lap_d_facto1_sched1_kway_rqrrtbegin Test #2578: mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_rqrcpbegin ...***Timeout 433.15 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.335559e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.065218e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping 
criterion -1 Time for reordering 1.244840e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 9.304247e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.482162e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.626594e-01 s Time to initialize coeftab 4.208080e-01 s Time to factorize 2.001649e+00 s ( 2.61 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44 Ko / 44.3 Ko ------------------------------------------------ Total 68.2 Ko / 68.5 Ko Time to solve 3.003954e+00 s - iteration 1 : total iteration time 4.14 s error 4.9333e-11 Time for refinement 7.326711e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.873218e-08 max(|| b_i - A x_i ||_1) 2.901490e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.645985e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.873218e-08 max(|| b_i - A x_i ||_1) 2.901490e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.645985e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.873218e-08 max(|| b_i - A x_i ||_1) 2.901490e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 
3.645985e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.873218e-08 max(|| b_i - A x_i ||_1) 2.901490e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.645985e-01 (SUCCESS) Test #2588: mpi_dst_example_simple_lap_s_facto1_sched1_kway_rqrrtbegin ..............***Timeout 433.16 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.498791e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.835012e-02 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.173512e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 8.061727e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.696690e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 5.040409e-01 s Time to initialize coeftab 1.860662e-01 s Time to factorize 1.993950e+00 s ( 2.62 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 8.898740e-01 s - iteration 1 : total iteration time 1.97 s error 5.3333e-11 Time for refinement 4.256512e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.852116e-08 max(|| b_i - A x_i ||_1) 2.881764e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.621197e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.852116e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.852116e-08 max(|| b_i - A x_i ||_1) 2.881764e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.621197e-01 (SUCCESS) 
max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.852116e-08 max(|| b_i - A x_i ||_1) 2.881764e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.621197e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.881764e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.621197e-01 (SUCCESS) Test #2589: mpi_dst_example_simple_lap_s_facto1_sched1_kway_rqrrtend ................***Timeout 433.15 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.386768e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L 
structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.736148e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.314741e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.314455e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.723996e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.991280e-02 s Time to initialize coeftab 1.044874e+00 s Time to factorize 7.578817e-01 s ( 6.91 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 9.279596e-01 s - iteration 1 : total iteration time 2.78 s error 1.1859e-11 Time for refinement 5.415604e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.938747e-08 max(|| b_i - A x_i ||_1) 2.914929e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.662871e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.938747e-08 max(|| b_i - A x_i ||_1) 2.914929e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * 
eps)) 3.662871e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.938747e-08 max(|| b_i - A x_i ||_1) 2.914929e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.662871e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.938747e-08 max(|| b_i - A x_i ||_1) 2.914929e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.662871e-01 (SUCCESS) Start 2917: mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_svdbegin Start 2918: mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_svdend Start 2919: mpi_dst_example_simple_lap_z_facto1_sched1_not_pqrcpbegin Start 2920: mpi_dst_example_simple_lap_z_facto1_sched1_not_pqrcpend Start 2921: mpi_dst_example_simple_lap_z_facto1_sched1_kway_pqrcpbegin Start 2922: mpi_dst_example_simple_lap_z_facto1_sched1_kway_pqrcpend Start 2923: mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_pqrcpbegin Start 2924: mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_pqrcpend Start 2925: mpi_dst_example_simple_lap_z_facto1_sched1_not_rqrcpbegin Start 2926: mpi_dst_example_simple_lap_z_facto1_sched1_not_rqrcpend Start 2927: mpi_dst_example_simple_lap_z_facto1_sched1_kway_rqrcpbegin Start 2928: mpi_dst_example_simple_lap_z_facto1_sched1_kway_rqrcpend Start 2929: mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_rqrcpbegin Start 2930: mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_rqrcpend Start 2931: mpi_dst_example_simple_lap_z_facto1_sched1_not_tqrcpbegin Start 2932: mpi_dst_example_simple_lap_z_facto1_sched1_not_tqrcpend Start 2933: mpi_dst_example_simple_lap_z_facto1_sched1_kway_tqrcpbegin Start 2934: mpi_dst_example_simple_lap_z_facto1_sched1_kway_tqrcpend Start 2935: mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_tqrcpbegin Start 2936: mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_tqrcpend Start 2937: mpi_dst_example_simple_lap_z_facto1_sched1_not_rqrrtbegin Start 2938: mpi_dst_example_simple_lap_z_facto1_sched1_not_rqrrtend Start 2939: 
mpi_dst_example_simple_lap_z_facto1_sched1_kway_rqrrtbegin Start 2940: mpi_dst_example_simple_lap_z_facto1_sched1_kway_rqrrtend Start 2941: mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_rqrrtbegin Start 2942: mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_rqrrtend Start 2943: mpi_dst_example_simple_lap_z_facto1_sched1_kway_pqrcpilu0 Start 2944: mpi_dst_example_simple_lap_z_facto1_sched1_kway_pqrcpilu1 Start 2945: mpi_dst_example_simple_lap_z_facto2_sched1_not_svdbegin Start 2946: mpi_dst_example_simple_lap_z_facto2_sched1_not_svdend Start 2947: mpi_dst_example_simple_lap_z_facto2_sched1_kway_svdbegin Start 2948: mpi_dst_example_simple_lap_z_facto2_sched1_kway_svdend Start 2949: mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_svdbegin Start 2950: mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_svdend Start 2951: mpi_dst_example_simple_lap_z_facto2_sched1_not_pqrcpbegin Start 2952: mpi_dst_example_simple_lap_z_facto2_sched1_not_pqrcpend Start 2953: mpi_dst_example_simple_lap_z_facto2_sched1_kway_pqrcpbegin Start 2954: mpi_dst_example_simple_lap_z_facto2_sched1_kway_pqrcpend Start 2955: mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_pqrcpbegin Start 2956: mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_pqrcpend Start 2957: mpi_dst_example_simple_lap_z_facto2_sched1_not_rqrcpbegin Start 2958: mpi_dst_example_simple_lap_z_facto2_sched1_not_rqrcpend Start 2959: mpi_dst_example_simple_lap_z_facto2_sched1_kway_rqrcpbegin Start 2960: mpi_dst_example_simple_lap_z_facto2_sched1_kway_rqrcpend Start 2961: mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_rqrcpbegin Start 2962: mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_rqrcpend Start 2963: mpi_dst_example_simple_lap_z_facto2_sched1_not_tqrcpbegin Start 2964: mpi_dst_example_simple_lap_z_facto2_sched1_not_tqrcpend Start 2965: mpi_dst_example_simple_lap_z_facto2_sched1_kway_tqrcpbegin Start 2966: 
mpi_dst_example_simple_lap_z_facto2_sched1_kway_tqrcpend Start 2967: mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_tqrcpbegin Start 2968: mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_tqrcpend Start 2969: mpi_dst_example_simple_lap_z_facto2_sched1_not_rqrrtbegin Start 2970: mpi_dst_example_simple_lap_z_facto2_sched1_not_rqrrtend Start 2971: mpi_dst_example_simple_lap_z_facto2_sched1_kway_rqrrtbegin Start 2972: mpi_dst_example_simple_lap_z_facto2_sched1_kway_rqrrtend Start 2973: mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_rqrrtbegin Start 2974: mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_rqrrtend Start 2975: mpi_dst_example_simple_lap_z_facto2_sched1_kway_pqrcpilu0 Start 2976: mpi_dst_example_simple_lap_z_facto2_sched1_kway_pqrcpilu1 Start 2977: mpi_dst_example_simple_lap_z_facto3_sched1_not_svdbegin Start 2978: mpi_dst_example_simple_lap_z_facto3_sched1_not_svdend Start 2979: mpi_dst_example_simple_lap_z_facto3_sched1_kway_svdbegin Start 2980: mpi_dst_example_simple_lap_z_facto3_sched1_kway_svdend Start 2981: mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_svdbegin Start 2982: mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_svdend Start 2983: mpi_dst_example_simple_lap_z_facto3_sched1_not_pqrcpbegin Start 2984: mpi_dst_example_simple_lap_z_facto3_sched1_not_pqrcpend Start 2985: mpi_dst_example_simple_lap_z_facto3_sched1_kway_pqrcpbegin Start 2986: mpi_dst_example_simple_lap_z_facto3_sched1_kway_pqrcpend Start 2987: mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_pqrcpbegin Start 2988: mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_pqrcpend Start 2989: mpi_dst_example_simple_lap_z_facto3_sched1_not_rqrcpbegin Start 2990: mpi_dst_example_simple_lap_z_facto3_sched1_not_rqrcpend Start 2991: mpi_dst_example_simple_lap_z_facto3_sched1_kway_rqrcpbegin Start 2992: mpi_dst_example_simple_lap_z_facto3_sched1_kway_rqrcpend Start 2993: 
mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_rqrcpbegin Start 2994: mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_rqrcpend Start 2995: mpi_dst_example_simple_lap_z_facto3_sched1_not_tqrcpbegin Start 2996: mpi_dst_example_simple_lap_z_facto3_sched1_not_tqrcpend Start 2997: mpi_dst_example_simple_lap_z_facto3_sched1_kway_tqrcpbegin Start 2998: mpi_dst_example_simple_lap_z_facto3_sched1_kway_tqrcpend Start 2999: mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_tqrcpbegin Start 3000: mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_tqrcpend Start 3001: mpi_dst_example_simple_lap_z_facto3_sched1_not_rqrrtbegin Start 3002: mpi_dst_example_simple_lap_z_facto3_sched1_not_rqrrtend Start 3003: mpi_dst_example_simple_lap_z_facto3_sched1_kway_rqrrtbegin Start 3004: mpi_dst_example_simple_lap_z_facto3_sched1_kway_rqrrtend Start 3005: mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_rqrrtbegin Start 3006: mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_rqrrtend Start 3007: mpi_dst_example_simple_lap_z_facto3_sched1_kway_pqrcpilu0 Start 3008: mpi_dst_example_simple_lap_z_facto3_sched1_kway_pqrcpilu1 Start 3009: mpi_dst_example_simple_lap_z_facto4_sched1_not_svdbegin Start 3010: mpi_dst_example_simple_lap_z_facto4_sched1_not_svdend Start 3011: mpi_dst_example_simple_lap_z_facto4_sched1_kway_svdbegin Start 3012: mpi_dst_example_simple_lap_z_facto4_sched1_kway_svdend Start 3013: mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_svdbegin Start 3014: mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_svdend Start 3015: mpi_dst_example_simple_lap_z_facto4_sched1_not_pqrcpbegin Start 3016: mpi_dst_example_simple_lap_z_facto4_sched1_not_pqrcpend Start 3017: mpi_dst_example_simple_lap_z_facto4_sched1_kway_pqrcpbegin Start 3018: mpi_dst_example_simple_lap_z_facto4_sched1_kway_pqrcpend Start 3019: mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_pqrcpbegin Start 3020: 
mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_pqrcpend Start 3021: mpi_dst_example_simple_lap_z_facto4_sched1_not_rqrcpbegin Start 3022: mpi_dst_example_simple_lap_z_facto4_sched1_not_rqrcpend Start 3023: mpi_dst_example_simple_lap_z_facto4_sched1_kway_rqrcpbegin Start 3024: mpi_dst_example_simple_lap_z_facto4_sched1_kway_rqrcpend Start 3025: mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_rqrcpbegin Start 3026: mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_rqrcpend Start 3027: mpi_dst_example_simple_lap_z_facto4_sched1_not_tqrcpbegin Start 3028: mpi_dst_example_simple_lap_z_facto4_sched1_not_tqrcpend Start 3029: mpi_dst_example_simple_lap_z_facto4_sched1_kway_tqrcpbegin Start 3030: mpi_dst_example_simple_lap_z_facto4_sched1_kway_tqrcpend Start 3031: mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_tqrcpbegin Start 3032: mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_tqrcpend Start 3033: mpi_dst_example_simple_lap_z_facto4_sched1_not_rqrrtbegin Start 3034: mpi_dst_example_simple_lap_z_facto4_sched1_not_rqrrtend Start 3035: mpi_dst_example_simple_lap_z_facto4_sched1_kway_rqrrtbegin Start 3036: mpi_dst_example_simple_lap_z_facto4_sched1_kway_rqrrtend Start 3037: mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_rqrrtbegin Start 3038: mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_rqrrtend Start 3039: mpi_dst_example_simple_lap_z_facto4_sched1_kway_pqrcpilu0 Start 3040: mpi_dst_example_simple_lap_z_facto4_sched1_kway_pqrcpilu1 Start 3041: mpi_dst_example_simple_lap_s_facto0_sched4_not_svdbegin Start 3042: mpi_dst_example_simple_lap_s_facto0_sched4_not_svdend Start 3043: mpi_dst_example_simple_lap_s_facto0_sched4_kway_svdbegin Start 3044: mpi_dst_example_simple_lap_s_facto0_sched4_kway_svdend Start 3045: mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_svdbegin Start 3046: mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_svdend Start 3047: 
mpi_dst_example_simple_lap_s_facto0_sched4_not_pqrcpbegin Start 3048: mpi_dst_example_simple_lap_s_facto0_sched4_not_pqrcpend Start 3049: mpi_dst_example_simple_lap_s_facto0_sched4_kway_pqrcpbegin Start 3050: mpi_dst_example_simple_lap_s_facto0_sched4_kway_pqrcpend Start 3051: mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_pqrcpbegin Start 3052: mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_pqrcpend Start 3053: mpi_dst_example_simple_lap_s_facto0_sched4_not_rqrcpbegin Start 3054: mpi_dst_example_simple_lap_s_facto0_sched4_not_rqrcpend Start 3055: mpi_dst_example_simple_lap_s_facto0_sched4_kway_rqrcpbegin Start 3056: mpi_dst_example_simple_lap_s_facto0_sched4_kway_rqrcpend Start 3057: mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_rqrcpbegin Start 3058: mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_rqrcpend Start 3059: mpi_dst_example_simple_lap_s_facto0_sched4_not_tqrcpbegin Start 3060: mpi_dst_example_simple_lap_s_facto0_sched4_not_tqrcpend Start 3061: mpi_dst_example_simple_lap_s_facto0_sched4_kway_tqrcpbegin Start 3062: mpi_dst_example_simple_lap_s_facto0_sched4_kway_tqrcpend Test #2518: mpi_dst_example_simple_lap_z_facto4_sched0_kway_tqrcpbegin ..............***Timeout 447.56 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational 
models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2521: mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_tqrcpend .....***Timeout 447.53 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: 
Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2522: mpi_dst_example_simple_lap_z_facto4_sched0_not_rqrrtbegin ...............***Timeout 447.52 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2524: mpi_dst_example_simple_lap_z_facto4_sched0_kway_rqrrtbegin ..............***Timeout 447.50 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel 
Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2528: mpi_dst_example_simple_lap_z_facto4_sched0_kway_pqrcpilu0 ...............***Timeout 447.44 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 
GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2532: mpi_dst_example_simple_lap_s_facto0_sched1_kway_svdbegin ................***Timeout 446.70 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The 
thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.386893e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.745299e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.249613e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.781040e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.623795e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 5.872714e-01 s Time to initialize coeftab 5.634744e-01 s Time to factorize 2.243619e+01 s (231.05 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 2.288535e+00 s Test #2536: mpi_dst_example_simple_lap_s_facto0_sched1_not_pqrcpbegin ...............***Timeout 446.64 sec ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2538: mpi_dst_example_simple_lap_s_facto0_sched1_kway_pqrcpbegin ..............***Timeout 446.59 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple 
Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.204139e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.488757e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.104542e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.987263e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.510835e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.453083e-02 s Time to initialize coeftab 1.000672e+00 s Test #2540: mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_pqrcpbegin ...***Timeout 446.57 sec 
ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.562762e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.876892e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.616617e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of 
operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 8.812785e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.682274e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 5.726477e-02 s Time to initialize coeftab 6.500891e-01 s Time to factorize 6.508101e+00 s (796.51 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 2.516276e+00 s Test #2541: mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_pqrcpend .....***Timeout 446.55 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 
ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2542: mpi_dst_example_simple_lap_s_facto0_sched1_not_rqrcpbegin ...............***Timeout 446.53 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2543: 
mpi_dst_example_simple_lap_s_facto0_sched1_not_rqrcpend .................***Timeout 446.52 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2545: mpi_dst_example_simple_lap_s_facto0_sched1_kway_rqrcpend ................***Timeout 446.48 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: 
Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2546: mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_rqrcpbegin ...***Timeout 446.47 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal 
width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2549: mpi_dst_example_simple_lap_s_facto0_sched1_not_tqrcpend .................***Timeout 446.41 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 
200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2552: mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_tqrcpbegin ...***Timeout 445.43 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2555: mpi_dst_example_simple_lap_s_facto0_sched1_not_rqrrtend .................***Timeout 444.83 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2557: mpi_dst_example_simple_lap_s_facto0_sched1_kway_rqrrtend ................***Timeout 444.79 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress 
method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2558: mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_rqrrtbegin ...***Timeout 444.78 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number 
has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2559: mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_rqrrtend .....***Timeout 444.75 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2560: mpi_dst_example_simple_lap_s_facto0_sched1_kway_pqrcpilu0 ...............***Timeout 444.73 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread 
number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2562: mpi_dst_example_simple_lap_s_facto1_sched1_not_svdbegin .................***Timeout 444.18 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 
160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2566: mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_svdbegin .....***Timeout 443.33 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread 
number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.421631e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.639222e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.626835e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.791561e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.589265e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.934372e-01 s Time to initialize coeftab 4.705999e-01 s Time to factorize 1.598667e+01 s (335.22 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Test #2569: mpi_dst_example_simple_lap_s_facto1_sched1_not_pqrcpend .................***Timeout 442.64 sec ischedInit: The thread number has been automatically set to 256 ischedInit: 
The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2570: mpi_dst_example_simple_lap_s_facto1_sched1_kway_pqrcpbegin ..............***Timeout 442.63 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI 
processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2575: mpi_dst_example_simple_lap_s_facto1_sched1_not_rqrcpend .................***Timeout 441.51 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of 
projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2611: mpi_dst_example_simple_lap_s_facto2_sched1_not_tqrcpbegin ...............***Timeout 440.98 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2611: mpi_dst_example_simple_lap_s_facto2_sched1_not_tqrcpbegin Test 
#2612: mpi_dst_example_simple_lap_s_facto2_sched1_not_tqrcpend .................***Timeout 440.98 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2612: mpi_dst_example_simple_lap_s_facto2_sched1_not_tqrcpend Test #2614: mpi_dst_example_simple_lap_s_facto2_sched1_kway_tqrcpend ................***Timeout 440.97 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: 
Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2614: mpi_dst_example_simple_lap_s_facto2_sched1_kway_tqrcpend Test #2620: mpi_dst_example_simple_lap_s_facto2_sched1_kway_rqrrtend ................***Timeout 439.95 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size 
(min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2620: mpi_dst_example_simple_lap_s_facto2_sched1_kway_rqrrtend Test #2622: mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_rqrrtend .....***Timeout 439.69 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 
3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.564221e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.537530e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.627499e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.604945e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.883712e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 9.726756e-01 s Time to initialize coeftab 4.744648e-01 s Time to factorize 1.939362e+00 s ( 5.15 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 1.453012e+00 s Start 2622: mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_rqrrtend Test #2633: mpi_dst_example_simple_lap_d_facto0_sched1_kway_pqrcpbegin ..............***Timeout 437.30 sec ischedInit: The thread number has been automatically set to 
256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.805684e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.003891e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.730463e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time 
to factorize 5.477282e-04 s Time for mapping/scheduling 8.706901e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.997920e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.459628e-01 s Time to initialize coeftab 7.018250e-01 s Time to factorize 8.881871e+00 s (583.64 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 2.698888e+00 s Start 2633: mpi_dst_example_simple_lap_d_facto0_sched1_kway_pqrcpbegin Test #2636: mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_pqrcpend .....***Timeout 437.28 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and 
projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2636: mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_pqrcpend Test #2637: mpi_dst_example_simple_lap_d_facto0_sched1_not_rqrcpbegin ...............***Timeout 437.28 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask 
: Ordering method is: Scotch Time to compute ordering 1.552655e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.443965e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.622702e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.160091e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.746570e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 5.603318e-01 s Time to initialize coeftab 7.021753e-01 s Time to factorize 1.542978e+01 s (335.96 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88 Ko / 88.6 Ko ------------------------------------------------ Total 136 Ko / 137 Ko Start 2637: mpi_dst_example_simple_lap_d_facto0_sched1_not_rqrcpbegin Test #2641: mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_rqrcpbegin ...***Timeout 437.22 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of 
threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2641: mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_rqrcpbegin Test #2647: mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_tqrcpbegin ...***Timeout 437.16 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 
Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2647: mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_tqrcpbegin Test #2649: mpi_dst_example_simple_lap_d_facto0_sched1_not_rqrrtbegin ...............***Timeout 437.15 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 
ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2649: mpi_dst_example_simple_lap_d_facto0_sched1_not_rqrrtbegin Test #2655: mpi_dst_example_simple_lap_d_facto0_sched1_kway_pqrcpilu0 ...............***Timeout 437.02 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2655: mpi_dst_example_simple_lap_d_facto0_sched1_kway_pqrcpilu0 Test #2659: 
mpi_dst_example_simple_lap_d_facto1_sched1_kway_svdbegin ................***Timeout 436.98 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2659: mpi_dst_example_simple_lap_d_facto1_sched1_kway_svdbegin Test #2660: mpi_dst_example_simple_lap_d_facto1_sched1_kway_svdend ..................***Timeout 436.98 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled 
PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2660: mpi_dst_example_simple_lap_d_facto1_sched1_kway_svdend Test #2662: mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_svdend .......***Timeout 436.96 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL 
GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.868183e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.524798e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.437810e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.243570e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.263002e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 5.333477e-01 s Time to initialize coeftab 6.257838e-01 s Time to factorize 4.380076e+00 s ( 1.19 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o 
Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 4.013421e+00 s Start 2662: mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_svdend Test #2667: mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_pqrcpbegin ...***Timeout 436.92 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2667: mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_pqrcpbegin Test #2670: mpi_dst_example_simple_lap_d_facto1_sched1_not_rqrcpend .................***Timeout 436.90 sec ischedInit: The thread number has been 
automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.915724e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.885309e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Start 2670: mpi_dst_example_simple_lap_d_facto1_sched1_not_rqrcpend Test #2671: mpi_dst_example_simple_lap_d_facto1_sched1_kway_rqrcpbegin ..............***Timeout 436.90 sec ischedInit: The thread number has been automatically set to 256 
ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2671: mpi_dst_example_simple_lap_d_facto1_sched1_kway_rqrcpbegin Test #2678: mpi_dst_example_simple_lap_d_facto1_sched1_kway_tqrcpend ................***Timeout 436.78 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: 
PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2678: mpi_dst_example_simple_lap_d_facto1_sched1_kway_tqrcpend Test #2684: mpi_dst_example_simple_lap_d_facto1_sched1_kway_rqrrtend ................***Timeout 436.73 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 
Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2684: mpi_dst_example_simple_lap_d_facto1_sched1_kway_rqrrtend Test #2686: mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_rqrrtend .....***Timeout 436.72 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 
1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2686: mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_rqrrtend Test #2689: mpi_dst_example_simple_lap_d_facto2_sched1_not_svdbegin .................***Timeout 436.71 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.277106e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 
17.592432 Time to compute symbol matrix 1.470718e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.438595e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 3.436799e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.702963e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.015888e+00 s Time to initialize coeftab 8.350456e-01 s Time to factorize 5.105238e+00 s ( 1.96 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 2.093516e+00 s Start 2689: mpi_dst_example_simple_lap_d_facto2_sched1_not_svdbegin Test #2696: mpi_dst_example_simple_lap_d_facto2_sched1_not_pqrcpend .................***Timeout 436.66 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 
GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2696: mpi_dst_example_simple_lap_d_facto2_sched1_not_pqrcpend Test #2697: mpi_dst_example_simple_lap_d_facto2_sched1_kway_pqrcpbegin ..............***Timeout 436.66 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block 
Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.812111e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.183837e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.189752e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.589051e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.109693e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 5.100976e-01 s Time to initialize coeftab 5.279946e-01 s Time to factorize 4.018461e+00 s ( 2.48 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 2.714975e+00 s Start 2697: mpi_dst_example_simple_lap_d_facto2_sched1_kway_pqrcpbegin Test #2700: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_pqrcpend 
.....***Timeout 436.64 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2700: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_pqrcpend Test #2703: mpi_dst_example_simple_lap_d_facto2_sched1_kway_rqrcpbegin ..............***Timeout 436.62 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 
6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2703: mpi_dst_example_simple_lap_d_facto2_sched1_kway_rqrcpbegin Test #2580: mpi_dst_example_simple_lap_s_facto1_sched1_not_tqrcpbegin ...............***Timeout 436.60 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models 
CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2581: mpi_dst_example_simple_lap_s_facto1_sched1_not_tqrcpend .................***Timeout 436.59 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric 
Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2583: mpi_dst_example_simple_lap_s_facto1_sched1_kway_tqrcpend ................***Timeout 436.57 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2584: mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_tqrcpbegin ...***Timeout 436.57 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has 
been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2587: mpi_dst_example_simple_lap_s_facto1_sched1_not_rqrrtend .................***Timeout 436.54 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI 
communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2599: mpi_dst_example_simple_lap_s_facto2_sched1_not_pqrcpbegin ...............***Timeout 436.42 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of 
projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2600: mpi_dst_example_simple_lap_s_facto2_sched1_not_pqrcpend .................***Timeout 436.42 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2602: mpi_dst_example_simple_lap_s_facto2_sched1_kway_pqrcpend ................***Timeout 436.40 sec ischedInit: The thread number has 
been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2604: mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_pqrcpend .....***Timeout 436.39 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: 
Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2708: mpi_dst_example_simple_lap_d_facto2_sched1_not_tqrcpend .................***Timeout 436.28 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min 
ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2708: mpi_dst_example_simple_lap_d_facto2_sched1_not_tqrcpend Test #2709: mpi_dst_example_simple_lap_d_facto2_sched1_kway_tqrcpbegin ..............***Timeout 436.28 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 
+-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.376303e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.007483e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.438121e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 3.487159e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.773695e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.790185e-01 s Time to initialize coeftab 4.603496e-01 s Time to factorize 4.530906e+00 s ( 2.20 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 225 Ko / 226 Ko Time to solve 1.485857e+00 s - iteration 1 : total iteration time 9.25 s error 6.8686e-14 Start 2709: mpi_dst_example_simple_lap_d_facto2_sched1_kway_tqrcpbegin Test #2710: mpi_dst_example_simple_lap_d_facto2_sched1_kway_tqrcpend ................***Timeout 436.28 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has 
been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2710: mpi_dst_example_simple_lap_d_facto2_sched1_kway_tqrcpend Test #2712: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_tqrcpend .....***Timeout 436.27 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication 
support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2712: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_tqrcpend 2586/3626 Test #2714: mpi_dst_example_simple_lap_d_facto2_sched1_not_rqrrtend .................***Timeout 436.26 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion 
per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2714: mpi_dst_example_simple_lap_d_facto2_sched1_not_rqrrtend 2586/3626 Test #2716: mpi_dst_example_simple_lap_d_facto2_sched1_kway_rqrrtend ................***Timeout 436.26 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : 
Ordering method is: Scotch Start 2716: mpi_dst_example_simple_lap_d_facto2_sched1_kway_rqrrtend 2586/3626 Test #2717: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_rqrrtbegin ...***Timeout 436.26 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2717: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_rqrrtbegin 2586/3626 Test #2718: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_rqrrtend .....***Timeout 436.26 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread 
number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2718: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_rqrrtend 2586/3626 Test #2719: mpi_dst_example_simple_lap_d_facto2_sched1_kway_pqrcpilu0 ...............***Timeout 436.27 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: 
PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2719: mpi_dst_example_simple_lap_d_facto2_sched1_kway_pqrcpilu0 2586/3626 Test #2720: mpi_dst_example_simple_lap_d_facto2_sched1_kway_pqrcpilu1 ...............***Timeout 436.27 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP 
Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2720: mpi_dst_example_simple_lap_d_facto2_sched1_kway_pqrcpilu1 2586/3626 Test #2721: mpi_dst_example_simple_lap_c_facto0_sched1_not_svdbegin .................***Timeout 436.27 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 
300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2721: mpi_dst_example_simple_lap_c_facto0_sched1_not_svdbegin 2586/3626 Test #2722: mpi_dst_example_simple_lap_c_facto0_sched1_not_svdend ...................***Timeout 436.27 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2722: mpi_dst_example_simple_lap_c_facto0_sched1_not_svdend 2586/3626 Test #2723: mpi_dst_example_simple_lap_c_facto0_sched1_kway_svdbegin ................***Timeout 436.27 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The 
thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2723: mpi_dst_example_simple_lap_c_facto0_sched1_kway_svdbegin 2586/3626 Test #2724: mpi_dst_example_simple_lap_c_facto0_sched1_kway_svdend ..................***Timeout 436.27 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number 
of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.847155e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.611920e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.365890e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.660933e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.129129e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.721072e-01 s Time to initialize coeftab 3.917734e-01 s Start 2724: 
mpi_dst_example_simple_lap_c_facto0_sched1_kway_svdend 2586/3626 Test #2725: mpi_dst_example_simple_lap_c_facto0_sched1_kwayprojections_svdbegin .....***Timeout 436.27 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2725: mpi_dst_example_simple_lap_c_facto0_sched1_kwayprojections_svdbegin 2586/3626 Test #2726: mpi_dst_example_simple_lap_c_facto0_sched1_kwayprojections_svdend .......***Timeout 436.27 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2726: mpi_dst_example_simple_lap_c_facto0_sched1_kwayprojections_svdend 2586/3626 Test #2727: mpi_dst_example_simple_lap_c_facto0_sched1_not_pqrcpbegin ...............***Timeout 436.27 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: 
Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.248944e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.620607e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.422973e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.862432e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.541413e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.756720e-02 s Time to initialize coeftab 7.374208e-01 s Start 2727: mpi_dst_example_simple_lap_c_facto0_sched1_not_pqrcpbegin 2586/3626 Test #2728: 
mpi_dst_example_simple_lap_c_facto0_sched1_not_pqrcpend .................***Timeout 436.28 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.543255e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.187604e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.487864e-01 s +-------------------------------------------------+ 
Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.913642e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.778035e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.452649e-01 s Time to initialize coeftab 4.953176e-01 s Time to factorize 7.672678e+00 s ( 2.64 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 3.233278e+00 s Time for refinement 4.650353e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.945133e-07 max(|| b_i - A x_i ||_1) 8.810800e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.223271e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.945133e-07 max(|| b_i - A x_i ||_1) 8.810800e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.223271e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.945133e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.945133e-07 max(|| b_i - A x_i ||_1) 8.810800e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.223271e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.810800e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * 
eps)) 2.223271e+00 (SUCCESS) Start 2728: mpi_dst_example_simple_lap_c_facto0_sched1_not_pqrcpend 2586/3626 Test #2729: mpi_dst_example_simple_lap_c_facto0_sched1_kway_pqrcpbegin ..............***Timeout 436.28 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 1: 300 1140 2: 200 760 3: 200 660 0: 300 1140 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2729: mpi_dst_example_simple_lap_c_facto0_sched1_kway_pqrcpbegin 2586/3626 Test #2730: mpi_dst_example_simple_lap_c_facto0_sched1_kway_pqrcpend ................***Timeout 436.29 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been 
automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2730: mpi_dst_example_simple_lap_c_facto0_sched1_kway_pqrcpend 2586/3626 Test #2731: mpi_dst_example_simple_lap_c_facto0_sched1_kwayprojections_pqrcpbegin ...***Timeout 436.29 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: 
AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2731: mpi_dst_example_simple_lap_c_facto0_sched1_kwayprojections_pqrcpbegin 2586/3626 Test #2732: mpi_dst_example_simple_lap_c_facto0_sched1_kwayprojections_pqrcpend .....***Timeout 436.29 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 
Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2732: mpi_dst_example_simple_lap_c_facto0_sched1_kwayprojections_pqrcpend 2586/3626 Test #2733: mpi_dst_example_simple_lap_c_facto0_sched1_not_rqrcpbegin ...............***Timeout 436.29 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 
1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.168494e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.353032e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.387088e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.819915e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.533117e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.873868e-01 s Time to initialize coeftab 8.710309e-01 s Time to factorize 2.604908e+01 s (797.25 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.361795e+00 s - iteration 1 : total iteration time 8.85 s error 6.0683e-11 Start 2733: mpi_dst_example_simple_lap_c_facto0_sched1_not_rqrcpbegin 2586/3626 Test #2734: mpi_dst_example_simple_lap_c_facto0_sched1_not_rqrcpend .................***Timeout 436.30 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2734: mpi_dst_example_simple_lap_c_facto0_sched1_not_rqrcpend 2586/3626 Test #2735: mpi_dst_example_simple_lap_c_facto0_sched1_kway_rqrcpbegin ..............***Timeout 436.30 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational 
models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2735: mpi_dst_example_simple_lap_c_facto0_sched1_kway_rqrcpbegin 2586/3626 Test #2736: mpi_dst_example_simple_lap_c_facto0_sched1_kway_rqrcpend ................***Timeout 436.30 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy 
KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2736: mpi_dst_example_simple_lap_c_facto0_sched1_kway_rqrcpend 2586/3626 Test #2737: mpi_dst_example_simple_lap_c_facto0_sched1_kwayprojections_rqrcpbegin ...***Timeout 436.30 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 
200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2737: mpi_dst_example_simple_lap_c_facto0_sched1_kwayprojections_rqrcpbegin 2586/3626 Test #2738: mpi_dst_example_simple_lap_c_facto0_sched1_kwayprojections_rqrcpend .....***Timeout 436.30 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2738: mpi_dst_example_simple_lap_c_facto0_sched1_kwayprojections_rqrcpend 2586/3626 Test #2739: mpi_dst_example_simple_lap_c_facto0_sched1_not_tqrcpbegin ...............***Timeout 436.30 sec ischedInit: The thread number has been automatically set to 
256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.617869e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.290984e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.141312e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 
MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.168528e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.770580e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.340542e-01 s Time to initialize coeftab 9.100915e-01 s Start 2739: mpi_dst_example_simple_lap_c_facto0_sched1_not_tqrcpbegin 2586/3626 Test #2740: mpi_dst_example_simple_lap_c_facto0_sched1_not_tqrcpend .................***Timeout 436.31 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2740: 
mpi_dst_example_simple_lap_c_facto0_sched1_not_tqrcpend 2586/3626 Test #2741: mpi_dst_example_simple_lap_c_facto0_sched1_kway_tqrcpbegin ..............***Timeout 436.31 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.898095e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.281479e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 
2.129196e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.400667e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.223041e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.892273e-01 s Time to initialize coeftab 1.480434e+00 s Start 2741: mpi_dst_example_simple_lap_c_facto0_sched1_kway_tqrcpbegin 2586/3626 Test #2742: mpi_dst_example_simple_lap_c_facto0_sched1_kway_tqrcpend ................***Timeout 436.31 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian 
Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2742: mpi_dst_example_simple_lap_c_facto0_sched1_kway_tqrcpend 2586/3626 Test #2743: mpi_dst_example_simple_lap_c_facto0_sched1_kwayprojections_tqrcpbegin ...***Timeout 436.31 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2743: mpi_dst_example_simple_lap_c_facto0_sched1_kwayprojections_tqrcpbegin 2586/3626 Test #2744: 
mpi_dst_example_simple_lap_c_facto0_sched1_kwayprojections_tqrcpend .....***Timeout 436.31 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.456444e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.012125e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.549449e-01 s +-------------------------------------------------+ 
Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.349113e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.873508e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.439246e-01 s Time to initialize coeftab 5.555715e-01 s Time to factorize 1.295641e+01 s ( 1.57 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 2.785025e+00 s Time for refinement 5.361396e+00 s Start 2744: mpi_dst_example_simple_lap_c_facto0_sched1_kwayprojections_tqrcpend 2586/3626 Test #2745: mpi_dst_example_simple_lap_c_facto0_sched1_not_rqrrtbegin ...............***Timeout 436.31 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory 
Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.064385e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.239945e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.111918e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.219947e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.374664e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 7.953290e-02 s Time to initialize coeftab 1.073416e+00 s Start 2745: mpi_dst_example_simple_lap_c_facto0_sched1_not_rqrrtbegin 2586/3626 Test #2746: mpi_dst_example_simple_lap_c_facto0_sched1_not_rqrrtend .................***Timeout 436.32 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been 
automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2746: mpi_dst_example_simple_lap_c_facto0_sched1_not_rqrrtend 2586/3626 Test #2747: mpi_dst_example_simple_lap_c_facto0_sched1_kway_rqrrtbegin ..............***Timeout 436.32 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI 
communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2747: mpi_dst_example_simple_lap_c_facto0_sched1_kway_rqrrtbegin 2586/3626 Test #2748: mpi_dst_example_simple_lap_c_facto0_sched1_kway_rqrrtend ................***Timeout 436.32 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal 
height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2748: mpi_dst_example_simple_lap_c_facto0_sched1_kway_rqrrtend 2586/3626 Test #2749: mpi_dst_example_simple_lap_c_facto0_sched1_kwayprojections_rqrrtbegin ...***Timeout 436.32 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 
660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2749: mpi_dst_example_simple_lap_c_facto0_sched1_kwayprojections_rqrrtbegin 2586/3626 Test #2750: mpi_dst_example_simple_lap_c_facto0_sched1_kwayprojections_rqrrtend .....***Timeout 436.33 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2750: mpi_dst_example_simple_lap_c_facto0_sched1_kwayprojections_rqrrtend 2586/3626 Test #2752: mpi_dst_example_simple_lap_c_facto0_sched1_kway_pqrcpilu1 ...............***Timeout 436.32 sec ischedInit: The thread number has been automatically set to 256 
ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2752: mpi_dst_example_simple_lap_c_facto0_sched1_kway_pqrcpilu1 2586/3626 Test #2753: mpi_dst_example_simple_lap_c_facto1_sched1_not_svdbegin .................***Timeout 436.32 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per 
process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2753: mpi_dst_example_simple_lap_c_facto1_sched1_not_svdbegin 2586/3626 Test #2754: mpi_dst_example_simple_lap_c_facto1_sched1_not_svdend ...................***Timeout 436.33 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress 
minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 2: 200 760 3: 200 660 1: 300 1140 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2754: mpi_dst_example_simple_lap_c_facto1_sched1_not_svdend 2586/3626 Test #2755: mpi_dst_example_simple_lap_c_facto1_sched1_kway_svdbegin ................***Timeout 436.33 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: 
CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2755: mpi_dst_example_simple_lap_c_facto1_sched1_kway_svdbegin 2586/3626 Test #2756: mpi_dst_example_simple_lap_c_facto1_sched1_kway_svdend ..................***Timeout 436.33 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2756: mpi_dst_example_simple_lap_c_facto1_sched1_kway_svdend 2586/3626 Test #2757: mpi_dst_example_simple_lap_c_facto1_sched1_kwayprojections_svdbegin .....***Timeout 436.33 sec ischedInit: The thread number 
has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2757: mpi_dst_example_simple_lap_c_facto1_sched1_kwayprojections_svdbegin 2586/3626 Test #2758: mpi_dst_example_simple_lap_c_facto1_sched1_kwayprojections_svdend .......***Timeout 436.34 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: 
Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2758: mpi_dst_example_simple_lap_c_facto1_sched1_kwayprojections_svdend 2586/3626 Test #2759: mpi_dst_example_simple_lap_c_facto1_sched1_not_pqrcpbegin ...............***Timeout 436.34 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 
Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2759: mpi_dst_example_simple_lap_c_facto1_sched1_not_pqrcpbegin 2586/3626 Test #2760: mpi_dst_example_simple_lap_c_facto1_sched1_not_pqrcpend .................***Timeout 436.34 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread 
number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2760: mpi_dst_example_simple_lap_c_facto1_sched1_not_pqrcpend 2586/3626 Test #2761: mpi_dst_example_simple_lap_c_facto1_sched1_kway_pqrcpbegin ..............***Timeout 436.34 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2761: mpi_dst_example_simple_lap_c_facto1_sched1_kway_pqrcpbegin 2586/3626 Test #2762: 
mpi_dst_example_simple_lap_c_facto1_sched1_kway_pqrcpend ................***Timeout 436.35 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2762: mpi_dst_example_simple_lap_c_facto1_sched1_kway_pqrcpend 2586/3626 Test #2763: mpi_dst_example_simple_lap_c_facto1_sched1_kwayprojections_pqrcpbegin ...***Timeout 436.35 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: 
Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.532823e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.994984e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.343885e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.496923e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.945025e+01 s 
+-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.471281e-01 s Time to initialize coeftab 2.281502e+00 s Time to factorize 9.803405e+00 s ( 2.17 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 3.499709e+00 s Start 2763: mpi_dst_example_simple_lap_c_facto1_sched1_kwayprojections_pqrcpbegin 2586/3626 Test #2764: mpi_dst_example_simple_lap_c_facto1_sched1_kwayprojections_pqrcpend .....***Timeout 436.36 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has 
been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2764: mpi_dst_example_simple_lap_c_facto1_sched1_kwayprojections_pqrcpend 2586/3626 Test #2765: mpi_dst_example_simple_lap_c_facto1_sched1_not_rqrcpbegin ...............***Timeout 436.36 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.571479e+01 s +-------------------------------------------------+ Symbolic 
factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.117102e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.821149e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.699484e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.706860e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.241104e-01 s Time to initialize coeftab 6.630952e-01 s Time to factorize 1.665778e+01 s ( 1.28 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Start 2765: mpi_dst_example_simple_lap_c_facto1_sched1_not_rqrcpbegin 2586/3626 Test #2766: mpi_dst_example_simple_lap_c_facto1_sched1_not_rqrcpend .................***Timeout 436.37 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI 
communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2766: mpi_dst_example_simple_lap_c_facto1_sched1_not_rqrcpend 2586/3626 Test #2767: mpi_dst_example_simple_lap_c_facto1_sched1_kway_rqrcpbegin ..............***Timeout 436.37 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per 
block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2767: mpi_dst_example_simple_lap_c_facto1_sched1_kway_rqrcpbegin 2586/3626 Test #2768: mpi_dst_example_simple_lap_c_facto1_sched1_kway_rqrcpend ................***Timeout 436.37 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: 
N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.504566e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.398856e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.000370e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.052763e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.737989e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.678065e-02 s Start 2768: mpi_dst_example_simple_lap_c_facto1_sched1_kway_rqrcpend 2586/3626 Test #2769: mpi_dst_example_simple_lap_c_facto1_sched1_kwayprojections_rqrcpbegin ...***Timeout 436.37 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking 
size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.409324e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.708321e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.659222e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.747447e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.686609e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.235567e-02 s Time to initialize coeftab 1.575625e+00 s Start 2769: mpi_dst_example_simple_lap_c_facto1_sched1_kwayprojections_rqrcpbegin 2586/3626 Test #2770: mpi_dst_example_simple_lap_c_facto1_sched1_kwayprojections_rqrcpend .....***Timeout 436.37 sec ischedInit: The thread number has been automatically set to 
256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2770: mpi_dst_example_simple_lap_c_facto1_sched1_kwayprojections_rqrcpend 2586/3626 Test #2771: mpi_dst_example_simple_lap_c_facto1_sched1_not_tqrcpbegin ...............***Timeout 436.37 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 
Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2771: mpi_dst_example_simple_lap_c_facto1_sched1_not_tqrcpbegin 2586/3626 Test #2772: mpi_dst_example_simple_lap_c_facto1_sched1_not_tqrcpend .................***Timeout 436.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L 
- CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2772: mpi_dst_example_simple_lap_c_facto1_sched1_not_tqrcpend 2586/3626 Test #2773: mpi_dst_example_simple_lap_c_facto1_sched1_kway_tqrcpbegin ..............***Timeout 436.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix 
type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2773: mpi_dst_example_simple_lap_c_facto1_sched1_kway_tqrcpbegin 2586/3626 Test #2774: mpi_dst_example_simple_lap_c_facto1_sched1_kway_tqrcpend ................***Timeout 436.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2774: mpi_dst_example_simple_lap_c_facto1_sched1_kway_tqrcpend 2586/3626 Test #2775: mpi_dst_example_simple_lap_c_facto1_sched1_kwayprojections_tqrcpbegin 
...***Timeout 436.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 3: 200 660 2: 200 760 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2775: mpi_dst_example_simple_lap_c_facto1_sched1_kwayprojections_tqrcpbegin 2586/3626 Test #2776: mpi_dst_example_simple_lap_c_facto1_sched1_kwayprojections_tqrcpend .....***Timeout 436.39 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2776: mpi_dst_example_simple_lap_c_facto1_sched1_kwayprojections_tqrcpend 2586/3626 Test #2777: mpi_dst_example_simple_lap_c_facto1_sched1_not_rqrrtbegin ...............***Timeout 436.39 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL 
GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2777: mpi_dst_example_simple_lap_c_facto1_sched1_not_rqrrtbegin 2586/3626 Test #2778: mpi_dst_example_simple_lap_c_facto1_sched1_not_rqrrtend .................***Timeout 436.39 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 
Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.076379e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.085991e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.768684e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.118925e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.424216e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 8.188607e-02 s Time to initialize coeftab 9.689343e-01 s Time to factorize 7.987911e+00 s ( 2.67 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Start 2778: mpi_dst_example_simple_lap_c_facto1_sched1_not_rqrrtend 2586/3626 Test #2779: mpi_dst_example_simple_lap_c_facto1_sched1_kway_rqrrtbegin ..............***Timeout 436.39 sec ischedInit: The 
thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.638177e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.576602e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.340085e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 
21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.959280e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.877477e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.895630e-01 s Time to initialize coeftab 5.298429e-01 s Start 2779: mpi_dst_example_simple_lap_c_facto1_sched1_kway_rqrrtbegin 2586/3626 Test #2780: mpi_dst_example_simple_lap_c_facto1_sched1_kway_rqrrtend ................***Timeout 436.39 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering 
method is: Scotch Start 2780: mpi_dst_example_simple_lap_c_facto1_sched1_kway_rqrrtend 2586/3626 Test #2781: mpi_dst_example_simple_lap_c_facto1_sched1_kwayprojections_rqrrtbegin ...***Timeout 436.39 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2781: mpi_dst_example_simple_lap_c_facto1_sched1_kwayprojections_rqrrtbegin 2586/3626 Test #2784: mpi_dst_example_simple_lap_c_facto1_sched1_kway_pqrcpilu1 ...............***Timeout 436.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number 
has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2784: mpi_dst_example_simple_lap_c_facto1_sched1_kway_pqrcpilu1 2586/3626 Test #2785: mpi_dst_example_simple_lap_c_facto2_sched1_not_svdbegin .................***Timeout 436.38 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational 
models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2785: mpi_dst_example_simple_lap_c_facto2_sched1_not_svdbegin 2586/3626 Test #2786: mpi_dst_example_simple_lap_c_facto2_sched1_not_svdend ...................***Timeout 436.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not 
used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.619750e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.589971e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.731963e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 8.459047e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.772875e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.255287e-01 s Time to initialize coeftab 3.040589e-01 s Time to factorize 9.050756e+00 s ( 4.42 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 2.729185e+00 s Start 2786: mpi_dst_example_simple_lap_c_facto2_sched1_not_svdend 2586/3626 Test 
#2787: mpi_dst_example_simple_lap_c_facto2_sched1_kway_svdbegin ................***Timeout 436.39 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2787: mpi_dst_example_simple_lap_c_facto2_sched1_kway_svdbegin 2586/3626 Test #2789: mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_svdbegin .....***Timeout 436.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2789: mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_svdbegin 2586/3626 Test #2791: mpi_dst_example_simple_lap_c_facto2_sched1_not_pqrcpbegin ...............***Timeout 436.38 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 
8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2791: mpi_dst_example_simple_lap_c_facto2_sched1_not_pqrcpbegin 2586/3626 Test #2792: mpi_dst_example_simple_lap_c_facto2_sched1_not_pqrcpend .................***Timeout 436.38 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: 
The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2792: mpi_dst_example_simple_lap_c_facto2_sched1_not_pqrcpend 2586/3626 Test #2793: mpi_dst_example_simple_lap_c_facto2_sched1_kway_pqrcpbegin ..............***Timeout 436.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : 
Ordering method is: Scotch Start 2793: mpi_dst_example_simple_lap_c_facto2_sched1_kway_pqrcpbegin 2586/3626 Test #2794: mpi_dst_example_simple_lap_c_facto2_sched1_kway_pqrcpend ................***Timeout 436.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2794: mpi_dst_example_simple_lap_c_facto2_sched1_kway_pqrcpend 2586/3626 Test #2795: mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_pqrcpbegin ...***Timeout 436.38 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2795: mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_pqrcpbegin 2586/3626 Test #2796: mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_pqrcpend .....***Timeout 436.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: 
Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2796: mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_pqrcpend 2586/3626 Test #2797: mpi_dst_example_simple_lap_c_facto2_sched1_not_rqrcpbegin ...............***Timeout 436.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 
Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2797: mpi_dst_example_simple_lap_c_facto2_sched1_not_rqrcpbegin 2586/3626 Test #2798: mpi_dst_example_simple_lap_c_facto2_sched1_not_rqrcpend .................***Timeout 436.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N 
nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2798: mpi_dst_example_simple_lap_c_facto2_sched1_not_rqrcpend 2586/3626 Test #2799: mpi_dst_example_simple_lap_c_facto2_sched1_kway_rqrcpbegin ..............***Timeout 436.38 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2799: mpi_dst_example_simple_lap_c_facto2_sched1_kway_rqrcpbegin 2586/3626 Test #2800: mpi_dst_example_simple_lap_c_facto2_sched1_kway_rqrcpend ................***Timeout 436.38 sec ischedInit: The thread number has been automatically set 
to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2800: mpi_dst_example_simple_lap_c_facto2_sched1_kway_rqrcpend 2586/3626 Test #2801: mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_rqrcpbegin ...***Timeout 436.38 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: 
PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2801: mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_rqrcpbegin 2586/3626 Test #2802: mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_rqrcpend .....***Timeout 436.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress 
minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2802: mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_rqrcpend 2586/3626 Test #2803: mpi_dst_example_simple_lap_c_facto2_sched1_not_tqrcpbegin ...............***Timeout 436.38 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been 
automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2803: mpi_dst_example_simple_lap_c_facto2_sched1_not_tqrcpbegin 2586/3626 Test #2804: mpi_dst_example_simple_lap_c_facto2_sched1_not_tqrcpend .................***Timeout 436.38 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.752082e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol 
factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.325153e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.032600e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 8.208981e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.900627e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.288597e-01 s Time to initialize coeftab 7.537097e-01 s Time to factorize 4.489020e+00 s ( 8.90 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Start 2804: mpi_dst_example_simple_lap_c_facto2_sched1_not_tqrcpend 2586/3626 Test #2805: mpi_dst_example_simple_lap_c_facto2_sched1_kway_tqrcpbegin ..............***Timeout 436.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: 
Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2805: mpi_dst_example_simple_lap_c_facto2_sched1_kway_tqrcpbegin 2586/3626 Test #2806: mpi_dst_example_simple_lap_c_facto2_sched1_kway_tqrcpend ................***Timeout 436.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections 
width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2806: mpi_dst_example_simple_lap_c_facto2_sched1_kway_tqrcpend 2586/3626 Test #2807: mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_tqrcpbegin ...***Timeout 436.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 
2807: mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_tqrcpbegin 2586/3626 Test #2808: mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_tqrcpend .....***Timeout 436.38 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2808: mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_tqrcpend 2586/3626 Test #2809: mpi_dst_example_simple_lap_c_facto2_sched1_not_rqrrtbegin ...............***Timeout 436.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2809: mpi_dst_example_simple_lap_c_facto2_sched1_not_rqrrtbegin 2586/3626 Test #2810: mpi_dst_example_simple_lap_c_facto2_sched1_not_rqrrtend .................***Timeout 436.38 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size 
(min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2810: mpi_dst_example_simple_lap_c_facto2_sched1_not_rqrrtend 2586/3626 Test #2811: mpi_dst_example_simple_lap_c_facto2_sched1_kway_rqrrtbegin ..............***Timeout 436.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min 
ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2811: mpi_dst_example_simple_lap_c_facto2_sched1_kway_rqrrtbegin 2586/3626 Test #2812: mpi_dst_example_simple_lap_c_facto2_sched1_kway_rqrrtend ................***Timeout 436.38 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 
1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.672148e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.262769e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.045781e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.602507e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.874001e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 5.171546e-01 s Time to initialize coeftab 4.112344e-01 s Time to factorize 4.310152e+00 s ( 9.27 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Start 2812: mpi_dst_example_simple_lap_c_facto2_sched1_kway_rqrrtend 2586/3626 Test #2813: mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_rqrrtbegin ...***Timeout 436.39 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: 
Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.131121e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.034528e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.848642e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.791250e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.419954e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 5.496358e-02 s Time to 
initialize coeftab 1.173025e+00 s Time to factorize 2.695226e+00 s (14.83 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko Start 2813: mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_rqrrtbegin 2586/3626 Test #2814: mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_rqrrtend .....***Timeout 436.39 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: 
Scotch Start 2814: mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_rqrrtend 2586/3626 Test #2815: mpi_dst_example_simple_lap_c_facto2_sched1_kway_pqrcpilu0 ...............***Timeout 436.39 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.749816e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.368572e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping 
criterion -1 Time for reordering 4.750958e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.969672e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.025790e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.714496e-01 s Time to initialize coeftab 1.042787e+00 s Time to factorize 2.205700e+00 s (18.12 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Start 2815: mpi_dst_example_simple_lap_c_facto2_sched1_kway_pqrcpilu0 2586/3626 Test #2816: mpi_dst_example_simple_lap_c_facto2_sched1_kway_pqrcpilu1 ...............***Timeout 436.39 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank 
parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2816: mpi_dst_example_simple_lap_c_facto2_sched1_kway_pqrcpilu1 2586/3626 Test #2817: mpi_dst_example_simple_lap_c_facto3_sched1_not_svdbegin .................***Timeout 436.39 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number 
has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2817: mpi_dst_example_simple_lap_c_facto3_sched1_not_svdbegin 2586/3626 Test #2818: mpi_dst_example_simple_lap_c_facto3_sched1_not_svdend ...................***Timeout 436.39 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2818: mpi_dst_example_simple_lap_c_facto3_sched1_not_svdend 2586/3626 Test #2819: 
mpi_dst_example_simple_lap_c_facto3_sched1_kway_svdbegin ................***Timeout 436.39 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2819: mpi_dst_example_simple_lap_c_facto3_sched1_kway_svdbegin 2586/3626 Test #2821: mpi_dst_example_simple_lap_c_facto3_sched1_kwayprojections_svdbegin .....***Timeout 436.39 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.470893e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.169274e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.596935e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.040585e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.639686e+01 s 
+-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 2.838784e-01 s Time to initialize coeftab 7.964458e-01 s Start 2821: mpi_dst_example_simple_lap_c_facto3_sched1_kwayprojections_svdbegin 2586/3626 Test #2823: mpi_dst_example_simple_lap_c_facto3_sched1_not_pqrcpbegin ...............***Timeout 436.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2823: mpi_dst_example_simple_lap_c_facto3_sched1_not_pqrcpbegin 2586/3626 Test #2826: mpi_dst_example_simple_lap_c_facto3_sched1_kway_pqrcpend ................***Timeout 436.37 sec 
ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2826: mpi_dst_example_simple_lap_c_facto3_sched1_kway_pqrcpend 2586/3626 Test #2827: mpi_dst_example_simple_lap_c_facto3_sched1_kwayprojections_pqrcpbegin ...***Timeout 436.37 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled 
StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.377046e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.573331e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.698801e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.474431e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.603651e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 
2.279543e-01 s Time to initialize coeftab 4.969607e-01 s Start 2827: mpi_dst_example_simple_lap_c_facto3_sched1_kwayprojections_pqrcpbegin 2586/3626 Test #2828: mpi_dst_example_simple_lap_c_facto3_sched1_kwayprojections_pqrcpend .....***Timeout 436.37 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.699666e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.609302e-01 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.277076e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.511443e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.032429e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 1.549019e-01 s Time to initialize coeftab 5.079114e-01 s Time to factorize 5.055308e+00 s ( 4.01 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko Start 2828: mpi_dst_example_simple_lap_c_facto3_sched1_kwayprojections_pqrcpend 2586/3626 Test #2829: mpi_dst_example_simple_lap_c_facto3_sched1_not_rqrcpbegin ...............***Timeout 436.37 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 
16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2829: mpi_dst_example_simple_lap_c_facto3_sched1_not_rqrcpbegin 2586/3626 Test #2830: mpi_dst_example_simple_lap_c_facto3_sched1_not_rqrcpend .................***Timeout 436.37 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections 
width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2830: mpi_dst_example_simple_lap_c_facto3_sched1_not_rqrcpend 2586/3626 Test #2831: mpi_dst_example_simple_lap_c_facto3_sched1_kway_rqrcpbegin ..............***Timeout 436.37 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.107837e+01 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.789397e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.547932e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.333777e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.413599e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 5.278193e-01 s Time to initialize coeftab 8.810756e-01 s Start 2831: mpi_dst_example_simple_lap_c_facto3_sched1_kway_rqrcpbegin 2586/3626 Test #2832: mpi_dst_example_simple_lap_c_facto3_sched1_kway_rqrcpend ................***Timeout 436.37 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy 
Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2832: mpi_dst_example_simple_lap_c_facto3_sched1_kway_rqrcpend 2586/3626 Test #2833: mpi_dst_example_simple_lap_c_facto3_sched1_kwayprojections_rqrcpbegin ...***Timeout 436.37 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: 
Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2833: mpi_dst_example_simple_lap_c_facto3_sched1_kwayprojections_rqrcpbegin 2586/3626 Test #2834: mpi_dst_example_simple_lap_c_facto3_sched1_kwayprojections_rqrcpend .....***Timeout 436.38 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.917406e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct 
Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.699923e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.096326e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.452404e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.104364e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 1.886736e-01 s Time to initialize coeftab 6.654718e-01 s Start 2834: mpi_dst_example_simple_lap_c_facto3_sched1_kwayprojections_rqrcpend 2586/3626 Test #2835: mpi_dst_example_simple_lap_c_facto3_sched1_not_tqrcpbegin ...............***Timeout 436.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of 
projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2835: mpi_dst_example_simple_lap_c_facto3_sched1_not_tqrcpbegin 2586/3626 Test #2836: mpi_dst_example_simple_lap_c_facto3_sched1_not_tqrcpend .................***Timeout 436.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 
+-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2836: mpi_dst_example_simple_lap_c_facto3_sched1_not_tqrcpend 2586/3626 Test #2837: mpi_dst_example_simple_lap_c_facto3_sched1_kway_tqrcpbegin ..............***Timeout 436.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2837: mpi_dst_example_simple_lap_c_facto3_sched1_kway_tqrcpbegin 2586/3626 Test #2838: mpi_dst_example_simple_lap_c_facto3_sched1_kway_tqrcpend ................***Timeout 436.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been 
automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2838: mpi_dst_example_simple_lap_c_facto3_sched1_kway_tqrcpend 2586/3626 Test #2839: mpi_dst_example_simple_lap_c_facto3_sched1_kwayprojections_tqrcpbegin ...***Timeout 436.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI 
communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2839: mpi_dst_example_simple_lap_c_facto3_sched1_kwayprojections_tqrcpbegin 2586/3626 Test #2841: mpi_dst_example_simple_lap_c_facto3_sched1_not_rqrrtbegin ...............***Timeout 436.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress 
minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2841: mpi_dst_example_simple_lap_c_facto3_sched1_not_rqrrtbegin 2586/3626 Test #2842: mpi_dst_example_simple_lap_c_facto3_sched1_not_rqrrtend .................***Timeout 436.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 
Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2842: mpi_dst_example_simple_lap_c_facto3_sched1_not_rqrrtend 2586/3626 Test #2843: mpi_dst_example_simple_lap_c_facto3_sched1_kway_rqrrtbegin ..............***Timeout 436.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2843: mpi_dst_example_simple_lap_c_facto3_sched1_kway_rqrrtbegin 2586/3626 Test #2844: mpi_dst_example_simple_lap_c_facto3_sched1_kway_rqrrtend ................***Timeout 436.38 sec ischedInit: The 
thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2844: mpi_dst_example_simple_lap_c_facto3_sched1_kway_rqrrtend 2586/3626 Test #2845: mpi_dst_example_simple_lap_c_facto3_sched1_kwayprojections_rqrrtbegin ...***Timeout 436.38 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number 
of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2845: mpi_dst_example_simple_lap_c_facto3_sched1_kwayprojections_rqrrtbegin 2586/3626 Test #2846: mpi_dst_example_simple_lap_c_facto3_sched1_kwayprojections_rqrrtend .....***Timeout 436.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - 
Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2846: mpi_dst_example_simple_lap_c_facto3_sched1_kwayprojections_rqrrtend 2586/3626 Test #2847: mpi_dst_example_simple_lap_c_facto3_sched1_kway_pqrcpilu0 ...............***Timeout 436.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections 
distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.838114e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.764467e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.311429e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.889345e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.100975e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 1.185026e-01 s Time to initialize coeftab 7.773203e-01 s Start 2847: mpi_dst_example_simple_lap_c_facto3_sched1_kway_pqrcpilu0 2586/3626 Test #2848: mpi_dst_example_simple_lap_c_facto3_sched1_kway_pqrcpilu1 ...............***Timeout 436.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: 
PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2848: mpi_dst_example_simple_lap_c_facto3_sched1_kway_pqrcpilu1 2586/3626 Test #2849: mpi_dst_example_simple_lap_c_facto4_sched1_not_svdbegin .................***Timeout 436.38 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of 
projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2849: mpi_dst_example_simple_lap_c_facto4_sched1_not_svdbegin 2586/3626 Test #2850: mpi_dst_example_simple_lap_c_facto4_sched1_not_svdend ...................***Timeout 436.38 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 
1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2850: mpi_dst_example_simple_lap_c_facto4_sched1_not_svdend 2586/3626 Test #2851: mpi_dst_example_simple_lap_c_facto4_sched1_kway_svdbegin ................***Timeout 436.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2851: mpi_dst_example_simple_lap_c_facto4_sched1_kway_svdbegin 2586/3626 Test #2853: mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_svdbegin .....***Timeout 436.38 sec ischedInit: The thread number has been automatically set to 256 
ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2853: mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_svdbegin 2586/3626 Test #2854: mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_svdend .......***Timeout 436.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: 
Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2854: mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_svdend 2586/3626 Test #2855: mpi_dst_example_simple_lap_c_facto4_sched1_not_pqrcpbegin ...............***Timeout 436.38 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 
Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2855: mpi_dst_example_simple_lap_c_facto4_sched1_not_pqrcpbegin 2586/3626 Test #2856: mpi_dst_example_simple_lap_c_facto4_sched1_not_pqrcpend .................***Timeout 436.38 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically 
set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2856: mpi_dst_example_simple_lap_c_facto4_sched1_not_pqrcpend 2586/3626 Test #2858: mpi_dst_example_simple_lap_c_facto4_sched1_kway_pqrcpend ................***Timeout 436.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2858: mpi_dst_example_simple_lap_c_facto4_sched1_kway_pqrcpend 2586/3626 Test #2860: 
mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_pqrcpend .....***Timeout 436.37 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2860: mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_pqrcpend 2586/3626 Test #2861: mpi_dst_example_simple_lap_c_facto4_sched1_not_rqrcpbegin ...............***Timeout 436.37 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread 
static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2861: mpi_dst_example_simple_lap_c_facto4_sched1_not_rqrcpbegin 2586/3626 Test #2862: mpi_dst_example_simple_lap_c_facto4_sched1_not_rqrcpend .................***Timeout 436.37 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: 
Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2862: mpi_dst_example_simple_lap_c_facto4_sched1_not_rqrcpend 2586/3626 Test #2863: mpi_dst_example_simple_lap_c_facto4_sched1_kway_rqrcpbegin ..............***Timeout 436.37 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been 
automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2863: mpi_dst_example_simple_lap_c_facto4_sched1_kway_rqrcpbegin 2586/3626 Test #2864: mpi_dst_example_simple_lap_c_facto4_sched1_kway_rqrcpend ................***Timeout 436.37 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 
2864: mpi_dst_example_simple_lap_c_facto4_sched1_kway_rqrcpend 2586/3626 Test #2865: mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_rqrcpbegin ...***Timeout 436.37 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2865: mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_rqrcpbegin 2586/3626 Test #2866: mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_rqrcpend .....***Timeout 436.38 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.576573e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.784529e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.590238e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.315715e+00 s 
+-------------------------------------------------+ Analyze task: Total time for analyze 1.730095e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 2.221409e-01 s Time to initialize coeftab 4.268963e-01 s Time to factorize 1.123760e+01 s ( 1.90 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Start 2866: mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_rqrcpend 2586/3626 Test #2867: mpi_dst_example_simple_lap_c_facto4_sched1_not_tqrcpbegin ...............***Timeout 436.37 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been 
automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2867: mpi_dst_example_simple_lap_c_facto4_sched1_not_tqrcpbegin 2586/3626 Test #2868: mpi_dst_example_simple_lap_c_facto4_sched1_not_tqrcpend .................***Timeout 436.37 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2868: 
mpi_dst_example_simple_lap_c_facto4_sched1_not_tqrcpend 2586/3626 Test #2869: mpi_dst_example_simple_lap_c_facto4_sched1_kway_tqrcpbegin ..............***Timeout 436.38 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2869: mpi_dst_example_simple_lap_c_facto4_sched1_kway_tqrcpbegin 2586/3626 Test #2870: mpi_dst_example_simple_lap_c_facto4_sched1_kway_tqrcpend ................***Timeout 436.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX 
package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.730494e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.373875e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.962952e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.099351e+00 s +-------------------------------------------------+ Analyze task: Total time for 
analyze 1.973395e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 1.567527e-01 s Time to initialize coeftab 6.971122e-01 s Time to factorize 1.064301e+01 s ( 2.00 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Start 2870: mpi_dst_example_simple_lap_c_facto4_sched1_kway_tqrcpend 2586/3626 Test #2871: mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_tqrcpbegin ...***Timeout 436.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically 
set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2871: mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_tqrcpbegin 2586/3626 Test #2872: mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_tqrcpend .....***Timeout 436.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2872: mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_tqrcpend 2586/3626 Test #2873: 
mpi_dst_example_simple_lap_c_facto4_sched1_not_rqrrtbegin ...............***Timeout 436.39 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2873: mpi_dst_example_simple_lap_c_facto4_sched1_not_rqrrtbegin 2586/3626 Test #2874: mpi_dst_example_simple_lap_c_facto4_sched1_not_rqrrtend .................***Timeout 436.39 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread 
dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2874: mpi_dst_example_simple_lap_c_facto4_sched1_not_rqrrtend 2586/3626 Test #2875: mpi_dst_example_simple_lap_c_facto4_sched1_kway_rqrrtbegin ..............***Timeout 436.39 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational 
models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2875: mpi_dst_example_simple_lap_c_facto4_sched1_kway_rqrrtbegin 2586/3626 Test #2876: mpi_dst_example_simple_lap_c_facto4_sched1_kway_rqrrtend ................***Timeout 436.39 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 
Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2876: mpi_dst_example_simple_lap_c_facto4_sched1_kway_rqrrtend 2586/3626 Test #2877: mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_rqrrtbegin ...***Timeout 436.39 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2877: 
mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_rqrrtbegin 2586/3626 Test #2878: mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_rqrrtend .....***Timeout 436.40 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2878: mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_rqrrtend 2586/3626 Test #2879: mpi_dst_example_simple_lap_c_facto4_sched1_kway_pqrcpilu0 ...............***Timeout 436.40 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically 
set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2879: mpi_dst_example_simple_lap_c_facto4_sched1_kway_pqrcpilu0 2586/3626 Test #2880: mpi_dst_example_simple_lap_c_facto4_sched1_kway_pqrcpilu1 ...............***Timeout 436.40 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD 
Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2880: mpi_dst_example_simple_lap_c_facto4_sched1_kway_pqrcpilu1 2586/3626 Test #2881: mpi_dst_example_simple_lap_z_facto0_sched1_not_svdbegin .................***Timeout 436.40 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 
Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2881: mpi_dst_example_simple_lap_z_facto0_sched1_not_svdbegin 2586/3626 Test #2882: mpi_dst_example_simple_lap_z_facto0_sched1_not_svdend ...................***Timeout 436.41 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 
+-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2882: mpi_dst_example_simple_lap_z_facto0_sched1_not_svdend 2586/3626 Test #2883: mpi_dst_example_simple_lap_z_facto0_sched1_kway_svdbegin ................***Timeout 436.41 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.558395e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.253706e-03 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.097569e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.688586e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.811021e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 8.444563e-01 s Time to initialize coeftab 4.147452e-01 s Start 2883: mpi_dst_example_simple_lap_z_facto0_sched1_kway_svdbegin 2586/3626 Test #2884: mpi_dst_example_simple_lap_z_facto0_sched1_kway_svdend ..................***Timeout 436.41 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread 
number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2884: mpi_dst_example_simple_lap_z_facto0_sched1_kway_svdend 2586/3626 Test #2885: mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_svdbegin .....***Timeout 436.41 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2885: 
mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_svdbegin 2586/3626 Test #2886: mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_svdend .......***Timeout 436.42 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2886: mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_svdend 2586/3626 Test #2887: mpi_dst_example_simple_lap_z_facto0_sched1_not_pqrcpbegin ...............***Timeout 436.42 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + 
PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2887: mpi_dst_example_simple_lap_z_facto0_sched1_not_pqrcpbegin 2586/3626 Test #2888: mpi_dst_example_simple_lap_z_facto0_sched1_not_pqrcpend .................***Timeout 436.42 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication 
support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2888: mpi_dst_example_simple_lap_z_facto0_sched1_not_pqrcpend 2586/3626 Test #2889: mpi_dst_example_simple_lap_z_facto0_sched1_kway_pqrcpbegin ..............***Timeout 436.42 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of 
projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2889: mpi_dst_example_simple_lap_z_facto0_sched1_kway_pqrcpbegin 2586/3626 Test #2890: mpi_dst_example_simple_lap_z_facto0_sched1_kway_pqrcpend ................***Timeout 436.43 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 
+-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2890: mpi_dst_example_simple_lap_z_facto0_sched1_kway_pqrcpend 2586/3626 Test #2891: mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_pqrcpbegin ...***Timeout 436.43 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.564033e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.768807e-03 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.285751e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.603533e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.750229e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.461162e-01 s Time to initialize coeftab 2.663786e-01 s Start 2891: mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_pqrcpbegin 2586/3626 Test #2892: mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_pqrcpend .....***Timeout 436.43 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy 
KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2892: mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_pqrcpend 2586/3626 Test #2893: mpi_dst_example_simple_lap_z_facto0_sched1_not_rqrcpbegin ...............***Timeout 436.44 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 
2893: mpi_dst_example_simple_lap_z_facto0_sched1_not_rqrcpbegin 2586/3626 Test #2894: mpi_dst_example_simple_lap_z_facto0_sched1_not_rqrcpend .................***Timeout 436.44 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2894: mpi_dst_example_simple_lap_z_facto0_sched1_not_rqrcpend 2586/3626 Test #2895: mpi_dst_example_simple_lap_z_facto0_sched1_kway_rqrcpbegin ..............***Timeout 436.44 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2895: mpi_dst_example_simple_lap_z_facto0_sched1_kway_rqrcpbegin 2586/3626 Test #2896: mpi_dst_example_simple_lap_z_facto0_sched1_kway_rqrcpend ................***Timeout 436.45 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 
MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2896: mpi_dst_example_simple_lap_z_facto0_sched1_kway_rqrcpend 2586/3626 Test #2897: mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_rqrcpbegin ...***Timeout 436.45 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 
2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2897: mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_rqrcpbegin 2586/3626 Test #2898: mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_rqrcpend .....***Timeout 436.45 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 
1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2898: mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_rqrcpend 2586/3626 Test #2899: mpi_dst_example_simple_lap_z_facto0_sched1_not_tqrcpbegin ...............***Timeout 436.46 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2899: mpi_dst_example_simple_lap_z_facto0_sched1_not_tqrcpbegin 2586/3626 Test #2900: mpi_dst_example_simple_lap_z_facto0_sched1_not_tqrcpend .................***Timeout 436.46 sec ischedInit: The thread number has been automatically 
set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2900: mpi_dst_example_simple_lap_z_facto0_sched1_not_tqrcpend 2586/3626 Test #2901: mpi_dst_example_simple_lap_z_facto0_sched1_kway_tqrcpbegin ..............***Timeout 436.47 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of 
threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2901: mpi_dst_example_simple_lap_z_facto0_sched1_kway_tqrcpbegin 2586/3626 Test #2902: mpi_dst_example_simple_lap_z_facto0_sched1_kway_tqrcpend ................***Timeout 436.47 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute 
Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2902: mpi_dst_example_simple_lap_z_facto0_sched1_kway_tqrcpend 2586/3626 Test #2903: mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_tqrcpbegin ...***Timeout 436.47 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: 
Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.718785e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.629565e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.520307e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.488400e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.971666e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.599302e-01 s Time to initialize coeftab 1.255491e+00 s Start 2903: mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_tqrcpbegin Test #2530: mpi_dst_example_simple_lap_s_facto0_sched1_not_svdbegin .................***Timeout 436.47 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: 
PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.282132e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.652881e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.410389e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.265245e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.561088e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 7.719355e-03 s Time to initialize coeftab 1.030158e+00 s Test #2550: mpi_dst_example_simple_lap_s_facto0_sched1_kway_tqrcpbegin ..............***Timeout 436.46 sec ischedInit: The thread number has been 
automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2554: mpi_dst_example_simple_lap_s_facto0_sched1_not_rqrrtbegin ...............***Timeout 436.45 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: 
Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2561: mpi_dst_example_simple_lap_s_facto0_sched1_kway_pqrcpilu1 ...............***Timeout 436.45 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization 
method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2564: mpi_dst_example_simple_lap_s_facto1_sched1_kway_svdbegin ................***Timeout 436.44 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2565: 
mpi_dst_example_simple_lap_s_facto1_sched1_kway_svdend ..................***Timeout 436.44 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2567: mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_svdend .......***Timeout 436.43 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of 
threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2568: mpi_dst_example_simple_lap_s_facto1_sched1_not_pqrcpbegin ...............***Timeout 436.42 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: 
Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.600312e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.221956e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.064979e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.394246e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.879552e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.782679e-02 s Test #2571: mpi_dst_example_simple_lap_s_facto1_sched1_kway_pqrcpend ................***Timeout 436.42 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package 
+ +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2572: mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_pqrcpbegin ...***Timeout 436.41 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank 
parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2574: mpi_dst_example_simple_lap_s_facto1_sched1_not_rqrcpbegin ...............***Timeout 436.40 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 
1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2577: mpi_dst_example_simple_lap_s_facto1_sched1_kway_rqrcpend ................***Timeout 436.39 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2615: mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_tqrcpbegin ...***Timeout 436.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2615: mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_tqrcpbegin Test #2617: mpi_dst_example_simple_lap_s_facto2_sched1_not_rqrrtbegin ...............***Timeout 436.37 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) 
Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2617: mpi_dst_example_simple_lap_s_facto2_sched1_not_rqrrtbegin Test #2619: mpi_dst_example_simple_lap_s_facto2_sched1_kway_rqrrtbegin ..............***Timeout 436.37 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of 
projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.385177e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.459955e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.072983e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 3.074162e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.782995e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.734738e-01 s Time to initialize coeftab 4.722442e-01 s Time to factorize 5.646867e+00 s ( 1.77 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Start 2619: mpi_dst_example_simple_lap_s_facto2_sched1_kway_rqrrtbegin Test #2621: mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_rqrrtbegin ...***Timeout 436.37 sec ischedInit: The thread number has been automatically set to 256 ischedInit: 
The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2621: mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_rqrrtbegin Test #2625: mpi_dst_example_simple_lap_d_facto0_sched1_not_svdbegin .................***Timeout 436.37 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: 
PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2625: mpi_dst_example_simple_lap_d_facto0_sched1_not_svdbegin Test #2626: mpi_dst_example_simple_lap_d_facto0_sched1_not_svdend ...................***Timeout 436.36 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy 
Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2626: mpi_dst_example_simple_lap_d_facto0_sched1_not_svdend Test #2627: mpi_dst_example_simple_lap_d_facto0_sched1_kway_svdbegin ................***Timeout 436.36 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 
Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.709029e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.340128e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.763204e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.193417e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.982778e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.940116e-01 s Time to initialize coeftab 6.629890e-01 s Start 2627: mpi_dst_example_simple_lap_d_facto0_sched1_kway_svdbegin Test #2628: mpi_dst_example_simple_lap_d_facto0_sched1_kway_svdend ..................***Timeout 436.36 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: 
Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2628: mpi_dst_example_simple_lap_d_facto0_sched1_kway_svdend Test #2629: mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_svdbegin .....***Timeout 436.37 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY 
and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2629: mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_svdbegin Test #2630: mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_svdend .......***Timeout 436.37 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2630: 
mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_svdend Test #2639: mpi_dst_example_simple_lap_d_facto0_sched1_kway_rqrcpbegin ..............***Timeout 436.36 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2639: mpi_dst_example_simple_lap_d_facto0_sched1_kway_rqrcpbegin Test #2640: mpi_dst_example_simple_lap_d_facto0_sched1_kway_rqrcpend ................***Timeout 436.37 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 
Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.914977e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.711635e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.504963e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.444127e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 
2.110511e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.970328e-01 s Time to initialize coeftab 9.381107e-01 s Time to factorize 3.228506e+00 s ( 1.57 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Start 2640: mpi_dst_example_simple_lap_d_facto0_sched1_kway_rqrcpend Test #2648: mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_tqrcpend .....***Timeout 436.36 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 
+-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2648: mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_tqrcpend Test #2651: mpi_dst_example_simple_lap_d_facto0_sched1_kway_rqrrtbegin ..............***Timeout 436.36 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2651: mpi_dst_example_simple_lap_d_facto0_sched1_kway_rqrrtbegin Test #2652: mpi_dst_example_simple_lap_d_facto0_sched1_kway_rqrrtend ................***Timeout 436.36 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically 
set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.755692e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.950758e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.828785e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 
2.215601e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.103521e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.756552e-01 s Time to initialize coeftab 7.315327e-01 s Time to factorize 3.757894e+00 s ( 1.35 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko Start 2652: mpi_dst_example_simple_lap_d_facto0_sched1_kway_rqrrtend Test #2653: mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_rqrrtbegin ...***Timeout 436.36 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been 
automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.321360e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.994866e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.820986e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.448062e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.583261e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.586474e-02 s Time to initialize coeftab 1.187267e+00 s Start 2653: mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_rqrrtbegin Test #2656: mpi_dst_example_simple_lap_d_facto0_sched1_kway_pqrcpilu1 ...............***Timeout 436.37 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational 
models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.520261e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.725645e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.964116e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.535198e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.717521e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 9.199703e-02 s Time to initialize coeftab 4.068358e-01 s Time to factorize 6.751748e+00 s (767.77 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: 
------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko Start 2656: mpi_dst_example_simple_lap_d_facto0_sched1_kway_pqrcpilu1 Test #2657: mpi_dst_example_simple_lap_d_facto1_sched1_not_svdbegin .................***Timeout 436.37 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.329558e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in 
L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.282483e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.959747e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Start 2657: mpi_dst_example_simple_lap_d_facto1_sched1_not_svdbegin Test #2661: mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_svdbegin .....***Timeout 436.37 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2661: mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_svdbegin Test 
#2664: mpi_dst_example_simple_lap_d_facto1_sched1_not_pqrcpend .................***Timeout 436.37 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2664: mpi_dst_example_simple_lap_d_facto1_sched1_not_pqrcpend Test #2665: mpi_dst_example_simple_lap_d_facto1_sched1_kway_pqrcpbegin ..............***Timeout 436.37 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.800531e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.827777e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.487989e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.244062e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.999290e+01 s 
+-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 7.364273e-02 s Time to initialize coeftab 9.236871e-01 s Time to factorize 3.732205e+00 s ( 1.40 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Start 2665: mpi_dst_example_simple_lap_d_facto1_sched1_kway_pqrcpbegin Test #2666: mpi_dst_example_simple_lap_d_facto1_sched1_kway_pqrcpend ................***Timeout 436.37 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 
+-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2666: mpi_dst_example_simple_lap_d_facto1_sched1_kway_pqrcpend Test #2668: mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_pqrcpend .....***Timeout 436.37 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2668: mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_pqrcpend Test #2672: mpi_dst_example_simple_lap_d_facto1_sched1_kway_rqrcpend ................***Timeout 436.36 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been 
automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2672: mpi_dst_example_simple_lap_d_facto1_sched1_kway_rqrcpend Test #2673: mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_rqrcpbegin ...***Timeout 436.36 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of 
threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.366271e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.799280e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.358838e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.868031e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.707671e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.887227e-01 s Time to initialize coeftab 5.009900e-01 s Time to factorize 4.390991e+00 s ( 1.19 MFlop/s) Number of 
operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.4 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Start 2673: mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_rqrcpbegin Test #2675: mpi_dst_example_simple_lap_d_facto1_sched1_not_tqrcpbegin ...............***Timeout 436.36 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2675: 
mpi_dst_example_simple_lap_d_facto1_sched1_not_tqrcpbegin Test #2676: mpi_dst_example_simple_lap_d_facto1_sched1_not_tqrcpend .................***Timeout 436.35 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2676: mpi_dst_example_simple_lap_d_facto1_sched1_not_tqrcpend Test #2677: mpi_dst_example_simple_lap_d_facto1_sched1_kway_tqrcpbegin ..............***Timeout 436.35 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2677: mpi_dst_example_simple_lap_d_facto1_sched1_kway_tqrcpbegin Test #2679: mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_tqrcpbegin ...***Timeout 436.35 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 
160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2679: mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_tqrcpbegin Test #2680: mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_tqrcpend .....***Timeout 436.35 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion 
per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2680: mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_tqrcpend Test #2681: mpi_dst_example_simple_lap_d_facto1_sched1_not_rqrrtbegin ...............***Timeout 436.34 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ 
Ordering subtask : Ordering method is: Scotch Start 2681: mpi_dst_example_simple_lap_d_facto1_sched1_not_rqrrtbegin Test #2682: mpi_dst_example_simple_lap_d_facto1_sched1_not_rqrrtend .................***Timeout 436.34 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2682: mpi_dst_example_simple_lap_d_facto1_sched1_not_rqrrtend Test #2690: mpi_dst_example_simple_lap_d_facto2_sched1_not_svdend ...................***Timeout 436.33 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + 
PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2690: mpi_dst_example_simple_lap_d_facto2_sched1_not_svdend Test #2692: mpi_dst_example_simple_lap_d_facto2_sched1_kway_svdend ..................***Timeout 436.33 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL 
GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2692: mpi_dst_example_simple_lap_d_facto2_sched1_kway_svdend Test #2693: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_svdbegin .....***Timeout 436.33 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections 
width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2693: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_svdbegin Test #2694: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_svdend .......***Timeout 436.33 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 
+-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2694: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_svdend Test #2698: mpi_dst_example_simple_lap_d_facto2_sched1_kway_pqrcpend ................***Timeout 436.34 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2698: mpi_dst_example_simple_lap_d_facto2_sched1_kway_pqrcpend Test #2699: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_pqrcpbegin ...***Timeout 436.34 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 
256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2699: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_pqrcpbegin Test #2701: mpi_dst_example_simple_lap_d_facto2_sched1_not_rqrcpbegin ...............***Timeout 436.34 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI 
communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2701: mpi_dst_example_simple_lap_d_facto2_sched1_not_rqrcpbegin Test #2702: mpi_dst_example_simple_lap_d_facto2_sched1_not_rqrcpend .................***Timeout 436.34 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL ischedInit: The thread number has been automatically set to 256 GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress 
method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2702: mpi_dst_example_simple_lap_d_facto2_sched1_not_rqrcpend Test #2582: mpi_dst_example_simple_lap_s_facto1_sched1_kway_tqrcpbegin ..............***Timeout 436.34 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 
300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2585: mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_tqrcpend .....***Timeout 436.33 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2586: mpi_dst_example_simple_lap_s_facto1_sched1_not_rqrrtbegin ...............***Timeout 436.32 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2590: mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_rqrrtbegin ...***Timeout 436.29 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) 
Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2591: mpi_dst_example_simple_lap_s_facto1_sched1_kway_pqrcpilu0 ...............***Timeout 436.28 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number 
has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2592: mpi_dst_example_simple_lap_s_facto1_sched1_kway_pqrcpilu1 ...............***Timeout 436.27 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2593: mpi_dst_example_simple_lap_s_facto2_sched1_not_svdbegin .................***Timeout 436.26 sec ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2595: mpi_dst_example_simple_lap_s_facto2_sched1_kway_svdbegin ................***Timeout 436.26 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number 
of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2596: mpi_dst_example_simple_lap_s_facto2_sched1_kway_svdend ..................***Timeout 436.25 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 
Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2597: mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_svdbegin .....***Timeout 436.24 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2601: 
mpi_dst_example_simple_lap_s_facto2_sched1_kway_pqrcpbegin ..............***Timeout 436.23 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2606: mpi_dst_example_simple_lap_s_facto2_sched1_not_rqrcpend .................***Timeout 436.23 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: 
Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2607: mpi_dst_example_simple_lap_s_facto2_sched1_kway_rqrcpbegin ..............***Timeout 436.22 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy 
Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.605541e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.095716e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.395501e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 7.599187e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.709968e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.465633e-01 s Time to initialize coeftab 4.405706e-01 s Time to factorize 5.992274e+00 s ( 1.67 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.4 Ko / 88.6 Ko Test #2608: mpi_dst_example_simple_lap_s_facto2_sched1_kway_rqrcpend ................***Timeout 
436.21 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2609: mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_rqrcpbegin ...***Timeout 436.20 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: 
Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2610: mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_rqrcpend .....***Timeout 436.19 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 
Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2704: mpi_dst_example_simple_lap_d_facto2_sched1_kway_rqrcpend ................***Timeout 436.18 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering 
subtask : Ordering method is: Scotch Start 2704: mpi_dst_example_simple_lap_d_facto2_sched1_kway_rqrcpend Test #2705: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_rqrcpbegin ...***Timeout 436.18 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2705: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_rqrcpbegin Test #2706: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_rqrcpend .....***Timeout 436.18 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has 
been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2706: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_rqrcpend Test #2707: mpi_dst_example_simple_lap_d_facto2_sched1_not_tqrcpbegin ...............***Timeout 436.17 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple 
Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2707: mpi_dst_example_simple_lap_d_facto2_sched1_not_tqrcpbegin Test #2711: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_tqrcpbegin ...***Timeout 436.17 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress 
minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2711: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_tqrcpbegin Test #2713: mpi_dst_example_simple_lap_d_facto2_sched1_not_rqrrtbegin ...............***Timeout 436.15 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 
3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2713: mpi_dst_example_simple_lap_d_facto2_sched1_not_rqrrtbegin 2614/3626 Test #2904: mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_tqrcpend .....***Timeout 314.94 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.340575e+02 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 
6.875906e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.251762e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 4.305848e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.347990e+02 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.876129e-01 s Time to initialize coeftab 2.576309e-01 s Time to factorize 6.959415e+00 s ( 2.91 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 7.843752e-01 s - iteration 1 : total iteration time 1.17 s error 3.8624e-16 Time for refinement 2.243790e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.052634e-16 max(|| b_i - A x_i ||_1) 8.548315e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.157030e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.052634e-16 max(|| b_i - A x_i ||_1) 8.548315e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.157030e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 
4.052634e-16 max(|| b_i - A x_i ||_1) 8.548315e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.157030e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.052634e-16 max(|| b_i - A x_i ||_1) 8.548315e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.157030e-03 (SUCCESS) Start 2904: mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_tqrcpend 2614/3626 Test #2905: mpi_dst_example_simple_lap_z_facto0_sched1_not_rqrrtbegin ...............***Timeout 291.46 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.323180e+02 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.034602e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.142482e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 4.226031e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.329031e+02 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.036462e-01 s Time to initialize coeftab 1.509761e+00 s Time to factorize 1.527641e+01 s ( 1.33 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 1.011661e-01 s - iteration 1 : total iteration time 0.27 s error 3.3016e-13 Time for refinement 5.842426e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.301590e-13 max(|| b_i - A x_i ||_1) 6.128562e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.546444e+00 
(SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.301590e-13 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.301590e-13 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.301590e-13 max(|| b_i - A x_i ||_1) 6.128562e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.546444e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 6.128562e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.546444e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 6.128562e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.546444e+00 (SUCCESS) Start 2905: mpi_dst_example_simple_lap_z_facto0_sched1_not_rqrrtbegin 2614/3626 Test #2906: mpi_dst_example_simple_lap_z_facto0_sched1_not_rqrrtend .................***Timeout 290.82 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 2: 
200 760 1: 300 1140 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.383034e+02 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.886374e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.082236e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.506968e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.389494e+02 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.510803e-02 s Time to initialize coeftab 2.293713e-01 s Time to factorize 1.134155e+01 s ( 1.79 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 3.128543e-01 s - iteration 1 : total iteration time 0.717 s error 1.6568e-14 Time for refinement 1.528999e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 
6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.656773e-14 max(|| b_i - A x_i ||_1) 1.155468e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.915638e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.656773e-14 max(|| b_i - A x_i ||_1) 1.155468e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.915638e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.656773e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.656773e-14 max(|| b_i - A x_i ||_1) 1.155468e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.915638e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 1.155468e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.915638e-02 (SUCCESS) Start 2906: mpi_dst_example_simple_lap_z_facto0_sched1_not_rqrrtend 2614/3626 Test #2907: mpi_dst_example_simple_lap_z_facto0_sched1_kway_rqrrtbegin ..............***Timeout 272.30 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 
ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.412436e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.006890e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.361800e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.447469e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.659256e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.084427e-02 s Time to initialize coeftab 1.084626e+00 s Time to factorize 1.778759e+01 s ( 1.14 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 1.486171e+01 s - iteration 1 : total iteration time 44.6 s error 3.0008e-13 Time for refinement 1.001724e+02 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 
max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.000784e-13 max(|| b_i - A x_i ||_1) 5.333655e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.345862e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.000784e-13 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.000784e-13 max(|| b_i - A x_i ||_1) 5.333655e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.345862e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 5.333655e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.345862e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.000784e-13 max(|| b_i - A x_i ||_1) 5.333655e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.345862e+00 (SUCCESS) Start 2907: mpi_dst_example_simple_lap_z_facto0_sched1_kway_rqrrtbegin 2614/3626 Test #2908: mpi_dst_example_simple_lap_z_facto0_sched1_kway_rqrrtend ................***Timeout 271.40 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute 
Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.034335e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.636591e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.726776e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.549797e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.163183e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.017888e-02 s Time to initialize coeftab 1.111219e+01 s Time to factorize 5.121301e+01 s (405.51 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 9.380649e+00 s - iteration 1 : total iteration time 5.2 s error 1.494e-14 Time for refinement 1.796107e+01 s 
|| A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.494252e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.494252e-14 max(|| b_i - A x_i ||_1) 1.362627e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.438371e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.494252e-14 max(|| b_i - A x_i ||_1) 1.362627e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.438371e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.494252e-14 max(|| b_i - A x_i ||_1) 1.362627e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.438371e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 1.362627e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.438371e-02 (SUCCESS) Start 2908: mpi_dst_example_simple_lap_z_facto0_sched1_kway_rqrrtend 2614/3626 Test #2909: mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_rqrrtbegin ...***Timeout 271.34 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia 
K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.304668e+02 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.466051e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.915665e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 6.447936e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.313551e+02 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 9.169404e-02 s Time to initialize coeftab 1.484180e+00 s Time to factorize 1.554991e+01 s ( 1.30 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko 
------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 8.512575e-02 s - iteration 1 : total iteration time 0.214 s error 3.3552e-13 Time for refinement 5.944417e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.355226e-13 max(|| b_i - A x_i ||_1) 6.072252e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.532235e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.355226e-13 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.355226e-13 max(|| b_i - A x_i ||_1) 6.072252e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.532235e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.355226e-13 max(|| b_i - A x_i ||_1) 6.072252e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.532235e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 6.072252e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.532235e+00 (SUCCESS) Start 2909: mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_rqrrtbegin 2614/3626 Test #2910: mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_rqrrtend .....***Timeout 269.73 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per 
process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.372589e+02 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.481758e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.591636e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 6.117834e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.383813e+02 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.025375e-01 s Time to initialize coeftab 1.822526e-01 s Time to factorize 5.332554e+00 s ( 3.80 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: 
------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 8.319701e-01 s - iteration 1 : total iteration time 0.948 s error 1.6249e-14 Time for refinement 2.105733e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.624260e-14 max(|| b_i - A x_i ||_1) 1.780470e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.492730e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.624260e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.624260e-14 max(|| b_i - A x_i ||_1) 1.780470e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.492730e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.624260e-14 max(|| b_i - A x_i ||_1) 1.780470e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.492730e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 1.780470e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.492730e-02 (SUCCESS) Start 2910: mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_rqrrtend 2614/3626 Test #2911: mpi_dst_example_simple_lap_z_facto0_sched1_kway_pqrcpilu0 ...............***Timeout 269.69 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: 
sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.366835e+02 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.964895e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.230048e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.578052e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.384546e+02 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 
1.259155e-02 s Time to initialize coeftab 1.041454e-01 s Time to factorize 1.808530e+01 s ( 1.12 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 3.083604e-01 s - iteration 1 : total iteration time 0.417 s error 6.1817e-15 Time for refinement 1.147057e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.180702e-15 max(|| b_i - A x_i ||_1) 9.859026e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.487766e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.180702e-15 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.180702e-15 max(|| b_i - A x_i ||_1) 9.859026e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.487766e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.180702e-15 max(|| b_i - A x_i ||_1) 9.859026e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.487766e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 9.859026e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.487766e-02 (SUCCESS) Start 2911: mpi_dst_example_simple_lap_z_facto0_sched1_kway_pqrcpilu0 2614/3626 Test #2912: mpi_dst_example_simple_lap_z_facto0_sched1_kway_pqrcpilu1 ...............***Timeout 268.88 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: 
The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.363622e+02 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.662215e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.091129e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 8.798389e-01 s +-------------------------------------------------+ Analyze task: 
Total time for analyze 1.375088e+02 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 8.517585e-02 s Time to initialize coeftab 1.773034e-01 s Time to factorize 1.232527e+01 s ( 1.65 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 2.817264e-01 s - iteration 1 : total iteration time 0.531 s error 6.912e-15 Time for refinement 1.414303e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.913877e-15 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.913877e-15 max(|| b_i - A x_i ||_1) 1.158043e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.922135e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.913877e-15 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.913877e-15 max(|| b_i - A x_i ||_1) 1.158043e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.922135e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 1.158043e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.922135e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 1.158043e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.922135e-02 (SUCCESS) Start 2912: mpi_dst_example_simple_lap_z_facto0_sched1_kway_pqrcpilu1 2614/3626 Test #2914: mpi_dst_example_simple_lap_z_facto1_sched1_not_svdend ...................***Timeout 256.07 sec ischedInit: The thread number has been 
automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.468619e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.296409e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.227856e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: 
Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.321980e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.844224e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.885413e-02 s Time to initialize coeftab 5.956924e+00 s Time to factorize 5.452810e+01 s (400.15 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 1.962318e+01 s - iteration 1 : total iteration time 11.8 s error 3.3824e-15 Time for refinement 2.827059e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.383900e-15 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.383900e-15 max(|| b_i - A x_i ||_1) 2.980533e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.520895e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.383900e-15 max(|| b_i - A x_i ||_1) 2.980533e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.520895e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 2.980533e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.520895e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.383900e-15 max(|| b_i - A x_i ||_1) 2.980533e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.520895e-03 (SUCCESS) Start 2914: 
mpi_dst_example_simple_lap_z_facto1_sched1_not_svdend 2614/3626 Test #2915: mpi_dst_example_simple_lap_z_facto1_sched1_kway_svdbegin ................***Timeout 255.13 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.532467e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.672273e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 
4.464689e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.150086e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.698194e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.203723e-01 s Time to initialize coeftab 1.349563e+00 s Time to factorize 1.466319e+01 s ( 1.45 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 2.585715e+01 s - iteration 1 : total iteration time 35.7 s error 2.2262e-14 Time for refinement 9.268904e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.226764e-14 max(|| b_i - A x_i ||_1) 3.829649e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.663504e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.226764e-14 max(|| b_i - A x_i ||_1) 3.829649e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.663504e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.226764e-14 max(|| b_i - A x_i ||_1) 3.829649e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.663504e-02 (SUCCESS) max(|| b_i - A 
x_i ||_2 / || b_i ||_2) 2.226764e-14 max(|| b_i - A x_i ||_1) 3.829649e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.663504e-02 (SUCCESS) Start 2915: mpi_dst_example_simple_lap_z_facto1_sched1_kway_svdbegin 2614/3626 Test #2916: mpi_dst_example_simple_lap_z_facto1_sched1_kway_svdend ..................***Timeout 254.40 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.371076e+02 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute 
symbol matrix 6.918213e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.362661e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.024157e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.385221e+02 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 7.359781e-02 s Time to initialize coeftab 2.388371e-01 s Time to factorize 1.559186e+01 s ( 1.37 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 2.861515e-01 s - iteration 1 : total iteration time 0.534 s error 3.056e-16 Time for refinement 9.626212e-01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.275756e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.275756e-16 max(|| b_i - A x_i ||_1) 8.591995e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.168052e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.275756e-16 max(|| b_i - A x_i ||_1) 8.591995e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * 
eps)) 2.168052e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 8.591995e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.168052e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.275756e-16 max(|| b_i - A x_i ||_1) 8.591995e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.168052e-03 (SUCCESS) Start 2916: mpi_dst_example_simple_lap_z_facto1_sched1_kway_svdend Start 3063: mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_tqrcpbegin Start 3064: mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_tqrcpend Start 3065: mpi_dst_example_simple_lap_s_facto0_sched4_not_rqrrtbegin Start 3066: mpi_dst_example_simple_lap_s_facto0_sched4_not_rqrrtend Start 3067: mpi_dst_example_simple_lap_s_facto0_sched4_kway_rqrrtbegin Start 3068: mpi_dst_example_simple_lap_s_facto0_sched4_kway_rqrrtend Start 3069: mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_rqrrtbegin Start 3070: mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_rqrrtend Start 3071: mpi_dst_example_simple_lap_s_facto0_sched4_kway_pqrcpilu0 Start 3072: mpi_dst_example_simple_lap_s_facto0_sched4_kway_pqrcpilu1 Start 3073: mpi_dst_example_simple_lap_s_facto1_sched4_not_svdbegin Start 3074: mpi_dst_example_simple_lap_s_facto1_sched4_not_svdend Start 3075: mpi_dst_example_simple_lap_s_facto1_sched4_kway_svdbegin Start 3076: mpi_dst_example_simple_lap_s_facto1_sched4_kway_svdend Start 3077: mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_svdbegin Start 3078: mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_svdend Start 3079: mpi_dst_example_simple_lap_s_facto1_sched4_not_pqrcpbegin Start 3080: mpi_dst_example_simple_lap_s_facto1_sched4_not_pqrcpend Start 3081: mpi_dst_example_simple_lap_s_facto1_sched4_kway_pqrcpbegin Start 3082: mpi_dst_example_simple_lap_s_facto1_sched4_kway_pqrcpend Start 3083: mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_pqrcpbegin Start 3084: mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_pqrcpend Start 
3085: mpi_dst_example_simple_lap_s_facto1_sched4_not_rqrcpbegin Start 3086: mpi_dst_example_simple_lap_s_facto1_sched4_not_rqrcpend Start 3087: mpi_dst_example_simple_lap_s_facto1_sched4_kway_rqrcpbegin Start 3088: mpi_dst_example_simple_lap_s_facto1_sched4_kway_rqrcpend Start 3089: mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_rqrcpbegin Start 3090: mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_rqrcpend Start 3091: mpi_dst_example_simple_lap_s_facto1_sched4_not_tqrcpbegin Start 3092: mpi_dst_example_simple_lap_s_facto1_sched4_not_tqrcpend Start 3093: mpi_dst_example_simple_lap_s_facto1_sched4_kway_tqrcpbegin Start 3094: mpi_dst_example_simple_lap_s_facto1_sched4_kway_tqrcpend Start 3095: mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_tqrcpbegin Start 3096: mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_tqrcpend Start 3097: mpi_dst_example_simple_lap_s_facto1_sched4_not_rqrrtbegin Start 3098: mpi_dst_example_simple_lap_s_facto1_sched4_not_rqrrtend Start 3099: mpi_dst_example_simple_lap_s_facto1_sched4_kway_rqrrtbegin Start 3100: mpi_dst_example_simple_lap_s_facto1_sched4_kway_rqrrtend Start 3101: mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_rqrrtbegin Start 3102: mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_rqrrtend Start 3103: mpi_dst_example_simple_lap_s_facto1_sched4_kway_pqrcpilu0 Start 3104: mpi_dst_example_simple_lap_s_facto1_sched4_kway_pqrcpilu1 Start 3105: mpi_dst_example_simple_lap_s_facto2_sched4_not_svdbegin Start 3106: mpi_dst_example_simple_lap_s_facto2_sched4_not_svdend Start 3107: mpi_dst_example_simple_lap_s_facto2_sched4_kway_svdbegin Start 3108: mpi_dst_example_simple_lap_s_facto2_sched4_kway_svdend Start 3109: mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_svdbegin Start 3110: mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_svdend Start 3111: mpi_dst_example_simple_lap_s_facto2_sched4_not_pqrcpbegin Start 3112: 
mpi_dst_example_simple_lap_s_facto2_sched4_not_pqrcpend Start 3113: mpi_dst_example_simple_lap_s_facto2_sched4_kway_pqrcpbegin Start 3114: mpi_dst_example_simple_lap_s_facto2_sched4_kway_pqrcpend Start 3115: mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_pqrcpbegin Start 3116: mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_pqrcpend Start 3117: mpi_dst_example_simple_lap_s_facto2_sched4_not_rqrcpbegin Start 3118: mpi_dst_example_simple_lap_s_facto2_sched4_not_rqrcpend Start 3119: mpi_dst_example_simple_lap_s_facto2_sched4_kway_rqrcpbegin Start 3120: mpi_dst_example_simple_lap_s_facto2_sched4_kway_rqrcpend Start 3121: mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_rqrcpbegin Start 3122: mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_rqrcpend Start 3123: mpi_dst_example_simple_lap_s_facto2_sched4_not_tqrcpbegin Start 3124: mpi_dst_example_simple_lap_s_facto2_sched4_not_tqrcpend Start 3125: mpi_dst_example_simple_lap_s_facto2_sched4_kway_tqrcpbegin Test #2613: mpi_dst_example_simple_lap_s_facto2_sched1_kway_tqrcpbegin .............. Passed 129.01 sec Test #2618: mpi_dst_example_simple_lap_s_facto2_sched1_not_rqrrtend ................. Passed 128.09 sec Test #2623: mpi_dst_example_simple_lap_s_facto2_sched1_kway_pqrcpilu0 ............... Passed 126.15 sec Test #2624: mpi_dst_example_simple_lap_s_facto2_sched1_kway_pqrcpilu1 ............... Passed 126.14 sec Test #2632: mpi_dst_example_simple_lap_d_facto0_sched1_not_pqrcpend ................. Passed 126.14 sec Test #2634: mpi_dst_example_simple_lap_d_facto0_sched1_kway_pqrcpend ................ Passed 125.09 sec Test #2638: mpi_dst_example_simple_lap_d_facto0_sched1_not_rqrcpend ................. Passed 122.98 sec Test #2643: mpi_dst_example_simple_lap_d_facto0_sched1_not_tqrcpbegin ............... Passed 121.94 sec Test #2644: mpi_dst_example_simple_lap_d_facto0_sched1_not_tqrcpend ................. 
Passed 121.93 sec Test #2645: mpi_dst_example_simple_lap_d_facto0_sched1_kway_tqrcpbegin .............. Passed 121.92 sec Test #2650: mpi_dst_example_simple_lap_d_facto0_sched1_not_rqrrtend ................. Passed 119.81 sec Test #2658: mpi_dst_example_simple_lap_d_facto1_sched1_not_svdend ................... Passed 118.76 sec Test #2663: mpi_dst_example_simple_lap_d_facto1_sched1_not_pqrcpbegin ............... Passed 115.71 sec Test #2674: mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_rqrcpend ..... Passed 112.80 sec Test #2685: mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_rqrrtbegin ... Passed 110.88 sec Test #2687: mpi_dst_example_simple_lap_d_facto1_sched1_kway_pqrcpilu0 ............... Passed 109.96 sec Test #2688: mpi_dst_example_simple_lap_d_facto1_sched1_kway_pqrcpilu1 ............... Passed 109.96 sec Test #2691: mpi_dst_example_simple_lap_d_facto2_sched1_kway_svdbegin ................ Passed 109.05 sec Test #2695: mpi_dst_example_simple_lap_d_facto2_sched1_not_pqrcpbegin ............... Passed 109.05 sec Test #2715: mpi_dst_example_simple_lap_d_facto2_sched1_kway_rqrrtbegin .............. Passed 93.76 sec Test #2751: mpi_dst_example_simple_lap_c_facto0_sched1_kway_pqrcpilu0 ............... Passed 67.63 sec Test #2782: mpi_dst_example_simple_lap_c_facto1_sched1_kwayprojections_rqrrtend ..... Passed 47.21 sec Test #2783: mpi_dst_example_simple_lap_c_facto1_sched1_kway_pqrcpilu0 ............... Passed 47.20 sec Test #2788: mpi_dst_example_simple_lap_c_facto2_sched1_kway_svdend .................. Passed 44.92 sec Test #2790: mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_svdend ....... Passed 44.27 sec Test #2820: mpi_dst_example_simple_lap_c_facto3_sched1_kway_svdend .................. Passed 29.43 sec Test #2822: mpi_dst_example_simple_lap_c_facto3_sched1_kwayprojections_svdend ....... Passed 29.31 sec Test #2824: mpi_dst_example_simple_lap_c_facto3_sched1_not_pqrcpend ................. 
Passed 28.78 sec Test #2825: mpi_dst_example_simple_lap_c_facto3_sched1_kway_pqrcpbegin .............. Passed 28.77 sec Test #2840: mpi_dst_example_simple_lap_c_facto3_sched1_kwayprojections_tqrcpend ..... Passed 24.35 sec 2644/3626 Test #2913: mpi_dst_example_simple_lap_z_facto1_sched1_not_svdbegin .................***Timeout 271.57 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.361464e+02 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to 
compute symbol matrix 4.456051e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.002306e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.993474e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.371246e+02 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.440792e-01 s Time to initialize coeftab 1.654266e+00 s Start 2913: mpi_dst_example_simple_lap_z_facto1_sched1_not_svdbegin Start 3126: mpi_dst_example_simple_lap_s_facto2_sched4_kway_tqrcpend Start 3127: mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_tqrcpbegin Start 3128: mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_tqrcpend Start 3129: mpi_dst_example_simple_lap_s_facto2_sched4_not_rqrrtbegin Start 3130: mpi_dst_example_simple_lap_s_facto2_sched4_not_rqrrtend Start 3131: mpi_dst_example_simple_lap_s_facto2_sched4_kway_rqrrtbegin Start 3132: mpi_dst_example_simple_lap_s_facto2_sched4_kway_rqrrtend Start 3133: mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_rqrrtbegin Start 3134: mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_rqrrtend Start 3135: mpi_dst_example_simple_lap_s_facto2_sched4_kway_pqrcpilu0 Start 3136: mpi_dst_example_simple_lap_s_facto2_sched4_kway_pqrcpilu1 Start 3137: mpi_dst_example_simple_lap_d_facto0_sched4_not_svdbegin Start 3138: mpi_dst_example_simple_lap_d_facto0_sched4_not_svdend Start 3139: mpi_dst_example_simple_lap_d_facto0_sched4_kway_svdbegin Start 3140: mpi_dst_example_simple_lap_d_facto0_sched4_kway_svdend Start 3141: mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_svdbegin 
Start 3142: mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_svdend Start 3143: mpi_dst_example_simple_lap_d_facto0_sched4_not_pqrcpbegin Start 3144: mpi_dst_example_simple_lap_d_facto0_sched4_not_pqrcpend Start 3145: mpi_dst_example_simple_lap_d_facto0_sched4_kway_pqrcpbegin Start 3146: mpi_dst_example_simple_lap_d_facto0_sched4_kway_pqrcpend Start 3147: mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_pqrcpbegin Start 3148: mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_pqrcpend Start 3149: mpi_dst_example_simple_lap_d_facto0_sched4_not_rqrcpbegin Start 3150: mpi_dst_example_simple_lap_d_facto0_sched4_not_rqrcpend Start 3151: mpi_dst_example_simple_lap_d_facto0_sched4_kway_rqrcpbegin Start 3152: mpi_dst_example_simple_lap_d_facto0_sched4_kway_rqrcpend Start 3153: mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_rqrcpbegin Start 3154: mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_rqrcpend Start 3155: mpi_dst_example_simple_lap_d_facto0_sched4_not_tqrcpbegin Test #2852: mpi_dst_example_simple_lap_c_facto4_sched1_kway_svdend .................. Passed 21.72 sec Test #2857: mpi_dst_example_simple_lap_c_facto4_sched1_kway_pqrcpbegin .............. Passed 20.50 sec Test #2859: mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_pqrcpbegin ... 
Passed 20.12 sec Start 3156: mpi_dst_example_simple_lap_d_facto0_sched4_not_tqrcpend Start 3157: mpi_dst_example_simple_lap_d_facto0_sched4_kway_tqrcpbegin Start 3158: mpi_dst_example_simple_lap_d_facto0_sched4_kway_tqrcpend Test #2616: mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_tqrcpend .....***Timeout 117.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2631: mpi_dst_example_simple_lap_d_facto0_sched1_not_pqrcpbegin ...............***Timeout 117.20 sec ischedInit: The thread number has been automatically set to 256 Test #2642: mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_rqrcpend .....***Timeout 117.16 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2669: mpi_dst_example_simple_lap_d_facto1_sched1_not_rqrcpbegin ...............***Timeout 116.86 sec ischedInit: The 
thread number has been automatically set to 256 Test #2683: mpi_dst_example_simple_lap_d_facto1_sched1_kway_rqrrtbegin ..............***Timeout 116.66 sec 2652/3626 Test #2917: mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_svdbegin .....***Timeout 116.18 sec Start 2917: mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_svdbegin 2652/3626 Test #2918: mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_svdend .......***Timeout 116.19 sec Start 2918: mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_svdend 2652/3626 Test #2919: mpi_dst_example_simple_lap_z_facto1_sched1_not_pqrcpbegin ...............***Timeout 116.20 sec Start 2919: mpi_dst_example_simple_lap_z_facto1_sched1_not_pqrcpbegin 2652/3626 Test #2920: mpi_dst_example_simple_lap_z_facto1_sched1_not_pqrcpend .................***Timeout 116.21 sec Start 2920: mpi_dst_example_simple_lap_z_facto1_sched1_not_pqrcpend 2652/3626 Test #2921: mpi_dst_example_simple_lap_z_facto1_sched1_kway_pqrcpbegin ..............***Timeout 116.22 sec Start 2921: mpi_dst_example_simple_lap_z_facto1_sched1_kway_pqrcpbegin 2652/3626 Test #2922: mpi_dst_example_simple_lap_z_facto1_sched1_kway_pqrcpend ................***Timeout 116.23 sec Start 2922: mpi_dst_example_simple_lap_z_facto1_sched1_kway_pqrcpend 2652/3626 Test #2923: mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_pqrcpbegin ...***Timeout 116.24 sec Start 2923: mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_pqrcpbegin 2652/3626 Test #2924: mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_pqrcpend .....***Timeout 116.24 sec Start 2924: mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_pqrcpend 2652/3626 Test #2925: mpi_dst_example_simple_lap_z_facto1_sched1_not_rqrcpbegin ...............***Timeout 116.26 sec Start 2925: mpi_dst_example_simple_lap_z_facto1_sched1_not_rqrcpbegin 2652/3626 Test #2926: mpi_dst_example_simple_lap_z_facto1_sched1_not_rqrcpend .................***Timeout 116.32 sec Start 
2926: mpi_dst_example_simple_lap_z_facto1_sched1_not_rqrcpend 2652/3626 Test #2927: mpi_dst_example_simple_lap_z_facto1_sched1_kway_rqrcpbegin ..............***Timeout 116.36 sec Start 2927: mpi_dst_example_simple_lap_z_facto1_sched1_kway_rqrcpbegin 2652/3626 Test #2928: mpi_dst_example_simple_lap_z_facto1_sched1_kway_rqrcpend ................***Timeout 116.37 sec Start 2928: mpi_dst_example_simple_lap_z_facto1_sched1_kway_rqrcpend 2652/3626 Test #2929: mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_rqrcpbegin ...***Timeout 116.39 sec Start 2929: mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_rqrcpbegin 2652/3626 Test #2930: mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_rqrcpend .....***Timeout 116.40 sec Start 2930: mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_rqrcpend 2652/3626 Test #2931: mpi_dst_example_simple_lap_z_facto1_sched1_not_tqrcpbegin ...............***Timeout 116.41 sec Start 2931: mpi_dst_example_simple_lap_z_facto1_sched1_not_tqrcpbegin 2652/3626 Test #2932: mpi_dst_example_simple_lap_z_facto1_sched1_not_tqrcpend .................***Timeout 116.52 sec Start 2932: mpi_dst_example_simple_lap_z_facto1_sched1_not_tqrcpend 2652/3626 Test #2933: mpi_dst_example_simple_lap_z_facto1_sched1_kway_tqrcpbegin ..............***Timeout 116.56 sec Start 2933: mpi_dst_example_simple_lap_z_facto1_sched1_kway_tqrcpbegin 2652/3626 Test #2934: mpi_dst_example_simple_lap_z_facto1_sched1_kway_tqrcpend ................***Timeout 116.63 sec Start 2934: mpi_dst_example_simple_lap_z_facto1_sched1_kway_tqrcpend 2652/3626 Test #2935: mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_tqrcpbegin ...***Timeout 117.17 sec Start 2935: mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_tqrcpbegin 2652/3626 Test #2936: mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_tqrcpend .....***Timeout 117.35 sec Start 2936: mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_tqrcpend 2652/3626 Test #2937: 
mpi_dst_example_simple_lap_z_facto1_sched1_not_rqrrtbegin ...............***Timeout 117.46 sec Start 2937: mpi_dst_example_simple_lap_z_facto1_sched1_not_rqrrtbegin 2652/3626 Test #2938: mpi_dst_example_simple_lap_z_facto1_sched1_not_rqrrtend .................***Timeout 117.59 sec Start 2938: mpi_dst_example_simple_lap_z_facto1_sched1_not_rqrrtend 2652/3626 Test #2939: mpi_dst_example_simple_lap_z_facto1_sched1_kway_rqrrtbegin ..............***Timeout 117.67 sec Start 2939: mpi_dst_example_simple_lap_z_facto1_sched1_kway_rqrrtbegin 2652/3626 Test #2940: mpi_dst_example_simple_lap_z_facto1_sched1_kway_rqrrtend ................***Timeout 117.72 sec Start 2940: mpi_dst_example_simple_lap_z_facto1_sched1_kway_rqrrtend 2652/3626 Test #2941: mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_rqrrtbegin ...***Timeout 117.84 sec Start 2941: mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_rqrrtbegin 2652/3626 Test #2942: mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_rqrrtend .....***Timeout 117.89 sec Start 2942: mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_rqrrtend 2652/3626 Test #2943: mpi_dst_example_simple_lap_z_facto1_sched1_kway_pqrcpilu0 ...............***Timeout 117.90 sec Start 2943: mpi_dst_example_simple_lap_z_facto1_sched1_kway_pqrcpilu0 2652/3626 Test #2944: mpi_dst_example_simple_lap_z_facto1_sched1_kway_pqrcpilu1 ...............***Timeout 117.91 sec Start 2944: mpi_dst_example_simple_lap_z_facto1_sched1_kway_pqrcpilu1 2652/3626 Test #2945: mpi_dst_example_simple_lap_z_facto2_sched1_not_svdbegin .................***Timeout 117.93 sec Start 2945: mpi_dst_example_simple_lap_z_facto2_sched1_not_svdbegin 2652/3626 Test #2946: mpi_dst_example_simple_lap_z_facto2_sched1_not_svdend ...................***Timeout 117.95 sec Start 2946: mpi_dst_example_simple_lap_z_facto2_sched1_not_svdend 2652/3626 Test #2947: mpi_dst_example_simple_lap_z_facto2_sched1_kway_svdbegin ................***Timeout 117.96 sec Start 2947: 
mpi_dst_example_simple_lap_z_facto2_sched1_kway_svdbegin 2652/3626 Test #2948: mpi_dst_example_simple_lap_z_facto2_sched1_kway_svdend ..................***Timeout 117.97 sec Start 2948: mpi_dst_example_simple_lap_z_facto2_sched1_kway_svdend 2652/3626 Test #2949: mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_svdbegin .....***Timeout 118.02 sec Start 2949: mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_svdbegin 2652/3626 Test #2950: mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_svdend .......***Timeout 118.06 sec Start 2950: mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_svdend 2652/3626 Test #2951: mpi_dst_example_simple_lap_z_facto2_sched1_not_pqrcpbegin ...............***Timeout 118.10 sec Start 2951: mpi_dst_example_simple_lap_z_facto2_sched1_not_pqrcpbegin 2652/3626 Test #2952: mpi_dst_example_simple_lap_z_facto2_sched1_not_pqrcpend .................***Timeout 118.13 sec Start 2952: mpi_dst_example_simple_lap_z_facto2_sched1_not_pqrcpend 2652/3626 Test #2953: mpi_dst_example_simple_lap_z_facto2_sched1_kway_pqrcpbegin ..............***Timeout 118.15 sec Start 2953: mpi_dst_example_simple_lap_z_facto2_sched1_kway_pqrcpbegin 2652/3626 Test #2954: mpi_dst_example_simple_lap_z_facto2_sched1_kway_pqrcpend ................***Timeout 118.17 sec Start 2954: mpi_dst_example_simple_lap_z_facto2_sched1_kway_pqrcpend 2652/3626 Test #2955: mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_pqrcpbegin ...***Timeout 118.18 sec Start 2955: mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_pqrcpbegin 2652/3626 Test #2956: mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_pqrcpend .....***Timeout 118.19 sec Start 2956: mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_pqrcpend 2652/3626 Test #2957: mpi_dst_example_simple_lap_z_facto2_sched1_not_rqrcpbegin ...............***Timeout 118.24 sec Start 2957: mpi_dst_example_simple_lap_z_facto2_sched1_not_rqrcpbegin 2652/3626 Test #2958: 
mpi_dst_example_simple_lap_z_facto2_sched1_not_rqrcpend .................***Timeout 118.30 sec Start 2958: mpi_dst_example_simple_lap_z_facto2_sched1_not_rqrcpend 2652/3626 Test #2959: mpi_dst_example_simple_lap_z_facto2_sched1_kway_rqrcpbegin ..............***Timeout 118.35 sec Start 2959: mpi_dst_example_simple_lap_z_facto2_sched1_kway_rqrcpbegin 2652/3626 Test #2960: mpi_dst_example_simple_lap_z_facto2_sched1_kway_rqrcpend ................***Timeout 118.38 sec Start 2960: mpi_dst_example_simple_lap_z_facto2_sched1_kway_rqrcpend 2652/3626 Test #2961: mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_rqrcpbegin ...***Timeout 118.42 sec Start 2961: mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_rqrcpbegin 2652/3626 Test #2962: mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_rqrcpend .....***Timeout 118.50 sec Start 2962: mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_rqrcpend 2652/3626 Test #2963: mpi_dst_example_simple_lap_z_facto2_sched1_not_tqrcpbegin ...............***Timeout 118.60 sec Start 2963: mpi_dst_example_simple_lap_z_facto2_sched1_not_tqrcpbegin 2652/3626 Test #2964: mpi_dst_example_simple_lap_z_facto2_sched1_not_tqrcpend .................***Timeout 118.69 sec Start 2964: mpi_dst_example_simple_lap_z_facto2_sched1_not_tqrcpend 2652/3626 Test #2965: mpi_dst_example_simple_lap_z_facto2_sched1_kway_tqrcpbegin ..............***Timeout 118.73 sec Start 2965: mpi_dst_example_simple_lap_z_facto2_sched1_kway_tqrcpbegin 2652/3626 Test #2966: mpi_dst_example_simple_lap_z_facto2_sched1_kway_tqrcpend ................***Timeout 118.80 sec Start 2966: mpi_dst_example_simple_lap_z_facto2_sched1_kway_tqrcpend 2652/3626 Test #2967: mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_tqrcpbegin ...***Timeout 118.81 sec Start 2967: mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_tqrcpbegin 2652/3626 Test #2968: mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_tqrcpend .....***Timeout 118.83 sec 
Start 2968: mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_tqrcpend 2652/3626 Test #2969: mpi_dst_example_simple_lap_z_facto2_sched1_not_rqrrtbegin ...............***Timeout 118.83 sec Start 2969: mpi_dst_example_simple_lap_z_facto2_sched1_not_rqrrtbegin 2652/3626 Test #2970: mpi_dst_example_simple_lap_z_facto2_sched1_not_rqrrtend .................***Timeout 118.90 sec Start 2970: mpi_dst_example_simple_lap_z_facto2_sched1_not_rqrrtend 2652/3626 Test #2971: mpi_dst_example_simple_lap_z_facto2_sched1_kway_rqrrtbegin ..............***Timeout 118.98 sec Start 2971: mpi_dst_example_simple_lap_z_facto2_sched1_kway_rqrrtbegin 2652/3626 Test #2972: mpi_dst_example_simple_lap_z_facto2_sched1_kway_rqrrtend ................***Timeout 119.06 sec Start 2972: mpi_dst_example_simple_lap_z_facto2_sched1_kway_rqrrtend 2652/3626 Test #2973: mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_rqrrtbegin ...***Timeout 119.10 sec Start 2973: mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_rqrrtbegin 2652/3626 Test #2974: mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_rqrrtend .....***Timeout 119.13 sec Start 2974: mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_rqrrtend 2652/3626 Test #2975: mpi_dst_example_simple_lap_z_facto2_sched1_kway_pqrcpilu0 ...............***Timeout 119.15 sec Start 2975: mpi_dst_example_simple_lap_z_facto2_sched1_kway_pqrcpilu0 2652/3626 Test #2976: mpi_dst_example_simple_lap_z_facto2_sched1_kway_pqrcpilu1 ...............***Timeout 119.17 sec Start 2976: mpi_dst_example_simple_lap_z_facto2_sched1_kway_pqrcpilu1 2652/3626 Test #2977: mpi_dst_example_simple_lap_z_facto3_sched1_not_svdbegin .................***Timeout 119.17 sec Start 2977: mpi_dst_example_simple_lap_z_facto3_sched1_not_svdbegin 2652/3626 Test #2978: mpi_dst_example_simple_lap_z_facto3_sched1_not_svdend ...................***Timeout 119.18 sec Start 2978: mpi_dst_example_simple_lap_z_facto3_sched1_not_svdend 2652/3626 Test #2979: 
mpi_dst_example_simple_lap_z_facto3_sched1_kway_svdbegin ................***Timeout 119.30 sec Start 2979: mpi_dst_example_simple_lap_z_facto3_sched1_kway_svdbegin 2652/3626 Test #2980: mpi_dst_example_simple_lap_z_facto3_sched1_kway_svdend ..................***Timeout 119.39 sec Start 2980: mpi_dst_example_simple_lap_z_facto3_sched1_kway_svdend 2652/3626 Test #2981: mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_svdbegin .....***Timeout 119.45 sec Start 2981: mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_svdbegin 2652/3626 Test #2982: mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_svdend .......***Timeout 119.48 sec Start 2982: mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_svdend 2652/3626 Test #2983: mpi_dst_example_simple_lap_z_facto3_sched1_not_pqrcpbegin ...............***Timeout 119.49 sec Start 2983: mpi_dst_example_simple_lap_z_facto3_sched1_not_pqrcpbegin 2652/3626 Test #2984: mpi_dst_example_simple_lap_z_facto3_sched1_not_pqrcpend .................***Timeout 119.57 sec Start 2984: mpi_dst_example_simple_lap_z_facto3_sched1_not_pqrcpend 2652/3626 Test #2985: mpi_dst_example_simple_lap_z_facto3_sched1_kway_pqrcpbegin ..............***Timeout 119.63 sec Start 2985: mpi_dst_example_simple_lap_z_facto3_sched1_kway_pqrcpbegin 2652/3626 Test #2986: mpi_dst_example_simple_lap_z_facto3_sched1_kway_pqrcpend ................***Timeout 119.69 sec Start 2986: mpi_dst_example_simple_lap_z_facto3_sched1_kway_pqrcpend 2652/3626 Test #2987: mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_pqrcpbegin ...***Timeout 119.81 sec Start 2987: mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_pqrcpbegin 2652/3626 Test #2988: mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_pqrcpend .....***Timeout 119.82 sec Start 2988: mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_pqrcpend 2652/3626 Test #2989: mpi_dst_example_simple_lap_z_facto3_sched1_not_rqrcpbegin ...............***Timeout 119.83 sec 
Start 2989: mpi_dst_example_simple_lap_z_facto3_sched1_not_rqrcpbegin 2652/3626 Test #2990: mpi_dst_example_simple_lap_z_facto3_sched1_not_rqrcpend .................***Timeout 119.85 sec Start 2990: mpi_dst_example_simple_lap_z_facto3_sched1_not_rqrcpend 2652/3626 Test #2991: mpi_dst_example_simple_lap_z_facto3_sched1_kway_rqrcpbegin ..............***Timeout 119.86 sec Start 2991: mpi_dst_example_simple_lap_z_facto3_sched1_kway_rqrcpbegin 2652/3626 Test #2992: mpi_dst_example_simple_lap_z_facto3_sched1_kway_rqrcpend ................***Timeout 119.86 sec Start 2992: mpi_dst_example_simple_lap_z_facto3_sched1_kway_rqrcpend 2652/3626 Test #2993: mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_rqrcpbegin ...***Timeout 119.87 sec Start 2993: mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_rqrcpbegin 2652/3626 Test #2994: mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_rqrcpend .....***Timeout 119.88 sec Start 2994: mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_rqrcpend 2652/3626 Test #2995: mpi_dst_example_simple_lap_z_facto3_sched1_not_tqrcpbegin ...............***Timeout 119.89 sec Start 2995: mpi_dst_example_simple_lap_z_facto3_sched1_not_tqrcpbegin 2652/3626 Test #2996: mpi_dst_example_simple_lap_z_facto3_sched1_not_tqrcpend .................***Timeout 119.90 sec Start 2996: mpi_dst_example_simple_lap_z_facto3_sched1_not_tqrcpend 2652/3626 Test #2997: mpi_dst_example_simple_lap_z_facto3_sched1_kway_tqrcpbegin ..............***Timeout 119.93 sec Start 2997: mpi_dst_example_simple_lap_z_facto3_sched1_kway_tqrcpbegin 2652/3626 Test #2998: mpi_dst_example_simple_lap_z_facto3_sched1_kway_tqrcpend ................***Timeout 119.97 sec Start 2998: mpi_dst_example_simple_lap_z_facto3_sched1_kway_tqrcpend 2652/3626 Test #2999: mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_tqrcpbegin ...***Timeout 120.03 sec Start 2999: mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_tqrcpbegin 2652/3626 Test #3000: 
mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_tqrcpend .....***Timeout 120.04 sec Start 3000: mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_tqrcpend 2652/3626 Test #3001: mpi_dst_example_simple_lap_z_facto3_sched1_not_rqrrtbegin ...............***Timeout 120.11 sec Start 3001: mpi_dst_example_simple_lap_z_facto3_sched1_not_rqrrtbegin 2652/3626 Test #3002: mpi_dst_example_simple_lap_z_facto3_sched1_not_rqrrtend .................***Timeout 120.13 sec Start 3002: mpi_dst_example_simple_lap_z_facto3_sched1_not_rqrrtend 2652/3626 Test #3003: mpi_dst_example_simple_lap_z_facto3_sched1_kway_rqrrtbegin ..............***Timeout 120.14 sec Start 3003: mpi_dst_example_simple_lap_z_facto3_sched1_kway_rqrrtbegin 2652/3626 Test #3004: mpi_dst_example_simple_lap_z_facto3_sched1_kway_rqrrtend ................***Timeout 120.15 sec Start 3004: mpi_dst_example_simple_lap_z_facto3_sched1_kway_rqrrtend 2652/3626 Test #3005: mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_rqrrtbegin ...***Timeout 120.23 sec Start 3005: mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_rqrrtbegin 2652/3626 Test #3006: mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_rqrrtend .....***Timeout 120.38 sec Start 3006: mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_rqrrtend 2652/3626 Test #3007: mpi_dst_example_simple_lap_z_facto3_sched1_kway_pqrcpilu0 ...............***Timeout 120.48 sec Start 3007: mpi_dst_example_simple_lap_z_facto3_sched1_kway_pqrcpilu0 2652/3626 Test #3008: mpi_dst_example_simple_lap_z_facto3_sched1_kway_pqrcpilu1 ...............***Timeout 120.53 sec Start 3008: mpi_dst_example_simple_lap_z_facto3_sched1_kway_pqrcpilu1 2652/3626 Test #3009: mpi_dst_example_simple_lap_z_facto4_sched1_not_svdbegin .................***Timeout 120.58 sec Start 3009: mpi_dst_example_simple_lap_z_facto4_sched1_not_svdbegin 2652/3626 Test #3010: mpi_dst_example_simple_lap_z_facto4_sched1_not_svdend ...................***Timeout 120.62 sec Start 
3010: mpi_dst_example_simple_lap_z_facto4_sched1_not_svdend 2652/3626 Test #3011: mpi_dst_example_simple_lap_z_facto4_sched1_kway_svdbegin ................***Timeout 120.65 sec Start 3011: mpi_dst_example_simple_lap_z_facto4_sched1_kway_svdbegin 2652/3626 Test #3012: mpi_dst_example_simple_lap_z_facto4_sched1_kway_svdend ..................***Timeout 120.70 sec Start 3012: mpi_dst_example_simple_lap_z_facto4_sched1_kway_svdend 2652/3626 Test #3013: mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_svdbegin .....***Timeout 120.71 sec Start 3013: mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_svdbegin 2652/3626 Test #3014: mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_svdend .......***Timeout 120.77 sec Start 3014: mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_svdend 2652/3626 Test #3015: mpi_dst_example_simple_lap_z_facto4_sched1_not_pqrcpbegin ...............***Timeout 120.81 sec Start 3015: mpi_dst_example_simple_lap_z_facto4_sched1_not_pqrcpbegin 2652/3626 Test #3016: mpi_dst_example_simple_lap_z_facto4_sched1_not_pqrcpend .................***Timeout 120.82 sec Start 3016: mpi_dst_example_simple_lap_z_facto4_sched1_not_pqrcpend 2652/3626 Test #3017: mpi_dst_example_simple_lap_z_facto4_sched1_kway_pqrcpbegin ..............***Timeout 120.83 sec Start 3017: mpi_dst_example_simple_lap_z_facto4_sched1_kway_pqrcpbegin 2652/3626 Test #3018: mpi_dst_example_simple_lap_z_facto4_sched1_kway_pqrcpend ................***Timeout 120.88 sec Start 3018: mpi_dst_example_simple_lap_z_facto4_sched1_kway_pqrcpend 2652/3626 Test #3019: mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_pqrcpbegin ...***Timeout 120.92 sec Start 3019: mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_pqrcpbegin 2652/3626 Test #3020: mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_pqrcpend .....***Timeout 120.97 sec Start 3020: mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_pqrcpend 2652/3626 Test #3021: 
mpi_dst_example_simple_lap_z_facto4_sched1_not_rqrcpbegin ...............***Timeout 121.00 sec Start 3021: mpi_dst_example_simple_lap_z_facto4_sched1_not_rqrcpbegin 2652/3626 Test #3022: mpi_dst_example_simple_lap_z_facto4_sched1_not_rqrcpend .................***Timeout 121.02 sec Start 3022: mpi_dst_example_simple_lap_z_facto4_sched1_not_rqrcpend 2652/3626 Test #3023: mpi_dst_example_simple_lap_z_facto4_sched1_kway_rqrcpbegin ..............***Timeout 121.05 sec Start 3023: mpi_dst_example_simple_lap_z_facto4_sched1_kway_rqrcpbegin 2652/3626 Test #3024: mpi_dst_example_simple_lap_z_facto4_sched1_kway_rqrcpend ................***Timeout 121.08 sec Start 3024: mpi_dst_example_simple_lap_z_facto4_sched1_kway_rqrcpend 2652/3626 Test #3025: mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_rqrcpbegin ...***Timeout 121.12 sec Start 3025: mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_rqrcpbegin 2652/3626 Test #3026: mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_rqrcpend .....***Timeout 121.15 sec Start 3026: mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_rqrcpend 2652/3626 Test #3027: mpi_dst_example_simple_lap_z_facto4_sched1_not_tqrcpbegin ...............***Timeout 121.16 sec Start 3027: mpi_dst_example_simple_lap_z_facto4_sched1_not_tqrcpbegin 2652/3626 Test #3028: mpi_dst_example_simple_lap_z_facto4_sched1_not_tqrcpend .................***Timeout 121.17 sec Start 3028: mpi_dst_example_simple_lap_z_facto4_sched1_not_tqrcpend 2652/3626 Test #3029: mpi_dst_example_simple_lap_z_facto4_sched1_kway_tqrcpbegin ..............***Timeout 121.19 sec Start 3029: mpi_dst_example_simple_lap_z_facto4_sched1_kway_tqrcpbegin 2652/3626 Test #3030: mpi_dst_example_simple_lap_z_facto4_sched1_kway_tqrcpend ................***Timeout 121.21 sec Start 3030: mpi_dst_example_simple_lap_z_facto4_sched1_kway_tqrcpend 2652/3626 Test #3031: mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_tqrcpbegin ...***Timeout 121.23 sec Start 3031: 
mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_tqrcpbegin 2652/3626 Test #3032: mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_tqrcpend .....***Timeout 121.25 sec Start 3032: mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_tqrcpend 2652/3626 Test #3033: mpi_dst_example_simple_lap_z_facto4_sched1_not_rqrrtbegin ...............***Timeout 121.25 sec Start 3033: mpi_dst_example_simple_lap_z_facto4_sched1_not_rqrrtbegin 2652/3626 Test #3034: mpi_dst_example_simple_lap_z_facto4_sched1_not_rqrrtend .................***Timeout 121.26 sec Start 3034: mpi_dst_example_simple_lap_z_facto4_sched1_not_rqrrtend 2652/3626 Test #3035: mpi_dst_example_simple_lap_z_facto4_sched1_kway_rqrrtbegin ..............***Timeout 121.29 sec Start 3035: mpi_dst_example_simple_lap_z_facto4_sched1_kway_rqrrtbegin 2652/3626 Test #3036: mpi_dst_example_simple_lap_z_facto4_sched1_kway_rqrrtend ................***Timeout 121.44 sec Start 3036: mpi_dst_example_simple_lap_z_facto4_sched1_kway_rqrrtend 2652/3626 Test #3037: mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_rqrrtbegin ...***Timeout 121.48 sec Start 3037: mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_rqrrtbegin 2652/3626 Test #3038: mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_rqrrtend .....***Timeout 121.49 sec Start 3038: mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_rqrrtend 2652/3626 Test #3039: mpi_dst_example_simple_lap_z_facto4_sched1_kway_pqrcpilu0 ...............***Timeout 121.50 sec Start 3039: mpi_dst_example_simple_lap_z_facto4_sched1_kway_pqrcpilu0 2652/3626 Test #3040: mpi_dst_example_simple_lap_z_facto4_sched1_kway_pqrcpilu1 ...............***Timeout 121.51 sec Start 3040: mpi_dst_example_simple_lap_z_facto4_sched1_kway_pqrcpilu1 2652/3626 Test #3041: mpi_dst_example_simple_lap_s_facto0_sched4_not_svdbegin .................***Timeout 121.53 sec Start 3041: mpi_dst_example_simple_lap_s_facto0_sched4_not_svdbegin 2652/3626 Test #3042: 
mpi_dst_example_simple_lap_s_facto0_sched4_not_svdend ...................***Timeout 121.58 sec Start 3042: mpi_dst_example_simple_lap_s_facto0_sched4_not_svdend 2652/3626 Test #3043: mpi_dst_example_simple_lap_s_facto0_sched4_kway_svdbegin ................***Timeout 121.62 sec Start 3043: mpi_dst_example_simple_lap_s_facto0_sched4_kway_svdbegin 2652/3626 Test #3044: mpi_dst_example_simple_lap_s_facto0_sched4_kway_svdend ..................***Timeout 121.66 sec Start 3044: mpi_dst_example_simple_lap_s_facto0_sched4_kway_svdend 2652/3626 Test #3045: mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_svdbegin .....***Timeout 121.72 sec Start 3045: mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_svdbegin 2652/3626 Test #3046: mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_svdend .......***Timeout 121.72 sec Start 3046: mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_svdend 2652/3626 Test #3047: mpi_dst_example_simple_lap_s_facto0_sched4_not_pqrcpbegin ...............***Timeout 121.76 sec Start 3047: mpi_dst_example_simple_lap_s_facto0_sched4_not_pqrcpbegin 2652/3626 Test #3048: mpi_dst_example_simple_lap_s_facto0_sched4_not_pqrcpend .................***Timeout 121.80 sec Start 3048: mpi_dst_example_simple_lap_s_facto0_sched4_not_pqrcpend 2652/3626 Test #3049: mpi_dst_example_simple_lap_s_facto0_sched4_kway_pqrcpbegin ..............***Timeout 121.81 sec Start 3049: mpi_dst_example_simple_lap_s_facto0_sched4_kway_pqrcpbegin 2652/3626 Test #3050: mpi_dst_example_simple_lap_s_facto0_sched4_kway_pqrcpend ................***Timeout 121.82 sec Start 3050: mpi_dst_example_simple_lap_s_facto0_sched4_kway_pqrcpend 2652/3626 Test #3051: mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_pqrcpbegin ...***Timeout 121.99 sec Start 3051: mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_pqrcpbegin 2652/3626 Test #3052: mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_pqrcpend .....***Timeout 122.09 sec Start 3052: 
mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_pqrcpend 2652/3626 Test #3053: mpi_dst_example_simple_lap_s_facto0_sched4_not_rqrcpbegin ...............***Timeout 122.20 sec Start 3053: mpi_dst_example_simple_lap_s_facto0_sched4_not_rqrcpbegin 2652/3626 Test #3054: mpi_dst_example_simple_lap_s_facto0_sched4_not_rqrcpend .................***Timeout 122.27 sec Start 3054: mpi_dst_example_simple_lap_s_facto0_sched4_not_rqrcpend 2652/3626 Test #3055: mpi_dst_example_simple_lap_s_facto0_sched4_kway_rqrcpbegin ..............***Timeout 122.29 sec Start 3055: mpi_dst_example_simple_lap_s_facto0_sched4_kway_rqrcpbegin 2652/3626 Test #3056: mpi_dst_example_simple_lap_s_facto0_sched4_kway_rqrcpend ................***Timeout 122.31 sec Start 3056: mpi_dst_example_simple_lap_s_facto0_sched4_kway_rqrcpend 2652/3626 Test #3057: mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_rqrcpbegin ...***Timeout 122.34 sec Start 3057: mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_rqrcpbegin 2652/3626 Test #3058: mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_rqrcpend .....***Timeout 122.37 sec Start 3058: mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_rqrcpend 2652/3626 Test #3059: mpi_dst_example_simple_lap_s_facto0_sched4_not_tqrcpbegin ...............***Timeout 122.46 sec Start 3059: mpi_dst_example_simple_lap_s_facto0_sched4_not_tqrcpbegin 2652/3626 Test #3060: mpi_dst_example_simple_lap_s_facto0_sched4_not_tqrcpend .................***Timeout 122.51 sec Start 3060: mpi_dst_example_simple_lap_s_facto0_sched4_not_tqrcpend 2652/3626 Test #3061: mpi_dst_example_simple_lap_s_facto0_sched4_kway_tqrcpbegin ..............***Timeout 122.53 sec Start 3061: mpi_dst_example_simple_lap_s_facto0_sched4_kway_tqrcpbegin 2652/3626 Test #3062: mpi_dst_example_simple_lap_s_facto0_sched4_kway_tqrcpend ................***Timeout 122.61 sec Start 3062: mpi_dst_example_simple_lap_s_facto0_sched4_kway_tqrcpend Start 3159: 
mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_tqrcpbegin Start 3160: mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_tqrcpend Start 3161: mpi_dst_example_simple_lap_d_facto0_sched4_not_rqrrtbegin Start 3162: mpi_dst_example_simple_lap_d_facto0_sched4_not_rqrrtend Start 3163: mpi_dst_example_simple_lap_d_facto0_sched4_kway_rqrrtbegin Test #2640: mpi_dst_example_simple_lap_d_facto0_sched1_kway_rqrcpend ................ Passed 141.72 sec Start 3164: mpi_dst_example_simple_lap_d_facto0_sched4_kway_rqrrtend Test #2641: mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_rqrcpbegin ... Passed 147.29 sec Start 3165: mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_rqrrtbegin Test #2813: mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_rqrrtbegin ...***Failed 147.46 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 [arch-nspawn-3655178:3260837] *** Process received signal *** [arch-nspawn-3655178:3260837] Signal: Segmentation fault (11) [arch-nspawn-3655178:3260837] Signal code: Address not mapped (1) [arch-nspawn-3655178:3260837] Failing at address: 0x7f23afca1760 [arch-nspawn-3655178:3260837] [ 0] linux-vdso.so.1(__vdso_rt_sigreturn+0x0) [0x7fc8ba15f6cc] [arch-nspawn-3655178:3260837] [ 1] /usr/lib/libopen-pal.so.80(mca_btl_sm_poll_handle_frag+0x18a) [0x7fc8b809aa02] [arch-nspawn-3655178:3260837] [ 2] /usr/lib/libopen-pal.so.80(+0x74504) [0x7fc8b809b504] [arch-nspawn-3655178:3260837] [ 3] /usr/lib/libopen-pal.so.80(opal_progress+0x30) [0x7fc8b804ca7a] [arch-nspawn-3655178:3260837] [ 4] /usr/lib/libopen-pal.so.80(ompi_sync_wait_mt+0xda) [0x7fc8b8079aa2] [arch-nspawn-3655178:3260837] [ 5] /usr/lib/libmpi.so.40(+0x7de1a) [0x7fc8b887de1a] [arch-nspawn-3655178:3260837] [ 6] /usr/lib/libmpi.so.40(ompi_request_default_wait+0x1a) [0x7fc8b888019c] [arch-nspawn-3655178:3260837] [ 7] /usr/lib/libmpi.so.40(ompi_coll_base_sendrecv_actual+0x98) 
[0x7fc8b88f03e8] [arch-nspawn-3655178:3260837] [ 8] /usr/lib/libmpi.so.40(ompi_coll_base_allreduce_intra_recursivedoubling+0x210) [0x7fc8b88f1a88] [arch-nspawn-3655178:3260837] [ 9] /usr/lib/libmpi.so.40(ompi_coll_base_allreduce_intra_ring+0x3fc) [0x7fc8b88f443c] [arch-nspawn-3655178:3260837] [10] /usr/lib/libmpi.so.40(ompi_coll_tuned_allreduce_intra_dec_fixed+0x40) [0x7fc8b8915152] [arch-nspawn-3655178:3260837] [11] /usr/lib/libmpi.so.40(MPI_Allreduce+0x294) [0x7fc8b888e584] [arch-nspawn-3655178:3260837] [12] /build/pastix/src/build/spm/src/libspm.so.1(spmUpdateComputedFields+0x140) [0x7fc8b8bdb458] [arch-nspawn-3655178:3260837] [13] /build/pastix/src/build/spm/src/libspm.so.1(genLaplacian+0xaa) [0x7fc8b8be421e] [arch-nspawn-3655178:3260837] [14] /build/pastix/src/build/spm/src/libspm.so.1(+0x409c8) [0x7fc8b8be59c8] [arch-nspawn-3655178:3260837] [15] ./simple(+0xe2c) [0x555555556e2c] [arch-nspawn-3655178:3260837] [16] /usr/lib/libc.so.6(+0x27fae) [0x7fc8b86a4fae] [arch-nspawn-3655178:3260837] [17] /usr/lib/libc.so.6(__libc_start_main+0x72) [0x7fc8b86a50b8] [arch-nspawn-3655178:3260837] [18] ./simple(+0x1174) [0x555555557174] [arch-nspawn-3655178:3260837] *** End of error message *** -------------------------------------------------------------------------- prte noticed that process rank 1 with PID 3260837 on node arch-nspawn-3655178 exited on signal 11 (Segmentation fault). -------------------------------------------------------------------------- Start 2813: mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_rqrrtbegin Test #2628: mpi_dst_example_simple_lap_d_facto0_sched1_kway_svdend .................. Passed 148.09 sec Start 3166: mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_rqrrtend Test #2809: mpi_dst_example_simple_lap_c_facto2_sched1_not_rqrrtbegin ............... 
Passed 149.74 sec Start 3167: mpi_dst_example_simple_lap_d_facto0_sched4_kway_pqrcpilu0 Test #2897: mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_rqrcpbegin ...***Failed 149.91 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 [arch-nspawn-3655178:3263629] *** Process received signal *** [arch-nspawn-3655178:3263629] Signal: Segmentation fault (11) [arch-nspawn-3655178:3263629] Signal code: Address not mapped (1) [arch-nspawn-3655178:3263629] Failing at address: 0x7f23afca1860 [arch-nspawn-3655178:3263629] [ 0] linux-vdso.so.1(__vdso_rt_sigreturn+0x0) [0x7fc35b5856cc] [arch-nspawn-3655178:3263629] [ 1] /usr/lib/libopen-pal.so.80(mca_btl_sm_poll_handle_frag+0x18a) [0x7fc35949aa02] [arch-nspawn-3655178:3263629] [ 2] /usr/lib/libopen-pal.so.80(+0x74504) [0x7fc35949b504] [arch-nspawn-3655178:3263629] [ 3] /usr/lib/libopen-pal.so.80(opal_progress+0x30) [0x7fc35944ca7a] [arch-nspawn-3655178:3263629] [ 4] 
/usr/lib/libopen-pal.so.80(ompi_sync_wait_mt+0xda) [0x7fc359479aa2] [arch-nspawn-3655178:3263629] [ 5] /usr/lib/libmpi.so.40(+0x7de1a) [0x7fc359c7de1a] [arch-nspawn-3655178:3263629] [ 6] /usr/lib/libmpi.so.40(ompi_request_default_wait+0x1a) [0x7fc359c8019c] [arch-nspawn-3655178:3263629] [ 7] /usr/lib/libmpi.so.40(ompi_coll_base_sendrecv_actual+0x98) [0x7fc359cf03e8] [arch-nspawn-3655178:3263629] [ 8] /usr/lib/libmpi.so.40(ompi_coll_base_allreduce_intra_recursivedoubling+0x210) [0x7fc359cf1a88] [arch-nspawn-3655178:3263629] [ 9] /usr/lib/libmpi.so.40(ompi_coll_base_allreduce_intra_ring+0x3fc) [0x7fc359cf443c] [arch-nspawn-3655178:3263629] [10] /usr/lib/libmpi.so.40(ompi_coll_tuned_allreduce_intra_dec_fixed+0x40) [0x7fc359d15152] [arch-nspawn-3655178:3263629] [11] /usr/lib/libmpi.so.40(MPI_Allreduce+0x294) [0x7fc359c8e584] [arch-nspawn-3655178:3263629] [12] /build/pastix/src/build/spm/src/libspm.so.1(spmUpdateComputedFields+0x140) [0x7fc359fdb458] [arch-nspawn-3655178:3263629] [13] /build/pastix/src/build/spm/src/libspm.so.1(genLaplacian+0xaa) [0x7fc359fe421e] [arch-nspawn-3655178:3263629] [14] /build/pastix/src/build/spm/src/libspm.so.1(+0x409c8) [0x7fc359fe59c8] [arch-nspawn-3655178:3263629] [15] ./simple(+0xe2c) [0x555555556e2c] [arch-nspawn-3655178:3263629] [16] /usr/lib/libc.so.6(+0x27fae) [0x7fc359aa4fae] [arch-nspawn-3655178:3263629] [17] /usr/lib/libc.so.6(__libc_start_main+0x72) [0x7fc359aa50b8] [arch-nspawn-3655178:3263629] [18] ./simple(+0x1174) [0x555555557174] [arch-nspawn-3655178:3263629] *** End of error message *** -------------------------------------------------------------------------- prte noticed that process rank 0 with PID 3263629 on node arch-nspawn-3655178 exited on signal 11 (Segmentation fault). -------------------------------------------------------------------------- Start 2897: mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_rqrcpbegin Test #2914: mpi_dst_example_simple_lap_z_facto1_sched1_not_svdend ................... 
Passed 151.15 sec Start 3168: mpi_dst_example_simple_lap_d_facto0_sched4_kway_pqrcpilu1 Test #2775: mpi_dst_example_simple_lap_c_facto1_sched1_kwayprojections_tqrcpbegin ... Passed 160.90 sec Start 3169: mpi_dst_example_simple_lap_d_facto1_sched4_not_svdbegin Test #2738: mpi_dst_example_simple_lap_c_facto0_sched1_kwayprojections_rqrcpend ..... Passed 163.63 sec Start 3170: mpi_dst_example_simple_lap_d_facto1_sched4_not_svdend Test #2708: mpi_dst_example_simple_lap_d_facto2_sched1_not_tqrcpend ................. Passed 165.97 sec Start 3171: mpi_dst_example_simple_lap_d_facto1_sched4_kway_svdbegin Test #2621: mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_rqrrtbegin ... Passed 166.60 sec Start 3172: mpi_dst_example_simple_lap_d_facto1_sched4_kway_svdend Test #2723: mpi_dst_example_simple_lap_c_facto0_sched1_kway_svdbegin ................ Passed 171.80 sec Start 3173: mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_svdbegin 2662/3626 Test #3109: mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_svdbegin ..... Passed 172.01 sec Start 3174: mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_svdend Test #2694: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_svdend ....... Passed 181.08 sec Start 3175: mpi_dst_example_simple_lap_d_facto1_sched4_not_pqrcpbegin Test #2767: mpi_dst_example_simple_lap_c_facto1_sched1_kway_rqrcpbegin .............. Passed 183.90 sec Start 3176: mpi_dst_example_simple_lap_d_facto1_sched4_not_pqrcpend Test #2770: mpi_dst_example_simple_lap_c_facto1_sched1_kwayprojections_rqrcpend ..... Passed 184.43 sec Start 3177: mpi_dst_example_simple_lap_d_facto1_sched4_kway_pqrcpbegin Test #2647: mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_tqrcpbegin ... Passed 189.08 sec Start 3178: mpi_dst_example_simple_lap_d_facto1_sched4_kway_pqrcpend Test #2747: mpi_dst_example_simple_lap_c_facto0_sched1_kway_rqrrtbegin .............. 
Passed 190.22 sec Start 3179: mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_pqrcpbegin Test #2740: mpi_dst_example_simple_lap_c_facto0_sched1_not_tqrcpend ................. Passed 190.31 sec Test #2876: mpi_dst_example_simple_lap_c_facto4_sched1_kway_rqrrtend ................ Passed 189.43 sec Start 3180: mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_pqrcpend Start 3181: mpi_dst_example_simple_lap_d_facto1_sched4_not_rqrcpbegin 2670/3626 Test #3099: mpi_dst_example_simple_lap_s_facto1_sched4_kway_rqrrtbegin .............. Passed 189.13 sec Start 3182: mpi_dst_example_simple_lap_d_facto1_sched4_not_rqrcpend Test #2845: mpi_dst_example_simple_lap_c_facto3_sched1_kwayprojections_rqrrtbegin ... Passed 191.25 sec Start 3183: mpi_dst_example_simple_lap_d_facto1_sched4_kway_rqrcpbegin 2672/3626 Test #3064: mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_tqrcpend ..... Passed 192.38 sec Start 3184: mpi_dst_example_simple_lap_d_facto1_sched4_kway_rqrcpend Test #2724: mpi_dst_example_simple_lap_c_facto0_sched1_kway_svdend .................. Passed 196.22 sec Start 3185: mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_rqrcpbegin 2674/3626 Test #3080: mpi_dst_example_simple_lap_s_facto1_sched4_not_pqrcpend ................. Passed 193.92 sec Start 3186: mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_rqrcpend Test #2861: mpi_dst_example_simple_lap_c_facto4_sched1_not_rqrcpbegin ............... Passed 195.70 sec Start 3187: mpi_dst_example_simple_lap_d_facto1_sched4_not_tqrcpbegin Test #2816: mpi_dst_example_simple_lap_c_facto2_sched1_kway_pqrcpilu1 ............... 
Passed 197.27 sec Start 3188: mpi_dst_example_simple_lap_d_facto1_sched4_not_tqrcpend Test #2611: mpi_dst_example_simple_lap_s_facto2_sched1_not_tqrcpbegin ...............***Timeout 335.58 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2612: mpi_dst_example_simple_lap_s_facto2_sched1_not_tqrcpend .................***Timeout 335.58 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2614: mpi_dst_example_simple_lap_s_facto2_sched1_kway_tqrcpend ................***Timeout 335.57 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2620: mpi_dst_example_simple_lap_s_facto2_sched1_kway_rqrrtend ................***Timeout 335.57 sec ischedInit: The thread number has been 
automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.758016e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.543842e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.149910e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model 
AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 7.955117e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.889935e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.691416e-01 s Time to initialize coeftab 8.158962e-02 s Time to factorize 2.171360e+00 s ( 4.60 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 3.695096e-01 s Test #2622: mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_rqrrtend .....***Timeout 335.56 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2633: mpi_dst_example_simple_lap_d_facto0_sched1_kway_pqrcpbegin ..............***Timeout 335.56 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: 
Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2636: mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_pqrcpend .....***Timeout 335.55 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 
2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.668969e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.577133e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.252962e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.251908e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.464306e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 8.217074e-02 s Time to initialize coeftab 4.889832e-02 s Test #2637: mpi_dst_example_simple_lap_d_facto0_sched1_not_rqrcpbegin ...............***Timeout 335.54 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank 
parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.160821e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.765837e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.475075e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Test #2649: mpi_dst_example_simple_lap_d_facto0_sched1_not_rqrrtbegin ...............***Timeout 335.53 sec Test #2655: mpi_dst_example_simple_lap_d_facto0_sched1_kway_pqrcpilu0 ...............***Timeout 335.52 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2659: mpi_dst_example_simple_lap_d_facto1_sched1_kway_svdbegin ................***Timeout 335.51 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread 
static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Test #2660: mpi_dst_example_simple_lap_d_facto1_sched1_kway_svdend ..................***Timeout 335.50 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2662: mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_svdend .......***Timeout 335.50 sec Test #2667: mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_pqrcpbegin ...***Timeout 335.49 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2670: mpi_dst_example_simple_lap_d_facto1_sched1_not_rqrcpend .................***Timeout 335.48 sec Test #2671: mpi_dst_example_simple_lap_d_facto1_sched1_kway_rqrcpbegin ..............***Timeout 335.48 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2678: mpi_dst_example_simple_lap_d_facto1_sched1_kway_tqrcpend ................***Timeout 335.47 sec ischedInit: The thread number has been automatically 
set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2684: mpi_dst_example_simple_lap_d_facto1_sched1_kway_rqrrtend ................***Timeout 335.46 sec ischedInit: The thread number has been automatically set to 256 Test #2686: mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_rqrrtend .....***Timeout 335.46 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread 
dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Test #2689: mpi_dst_example_simple_lap_d_facto2_sched1_not_svdbegin .................***Timeout 335.45 sec Test #2696: mpi_dst_example_simple_lap_d_facto2_sched1_not_pqrcpend .................***Timeout 335.44 sec ischedInit: The thread number has been automatically set to 256 Test #2697: mpi_dst_example_simple_lap_d_facto2_sched1_kway_pqrcpbegin ..............***Timeout 335.44 sec Test #2700: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_pqrcpend .....***Timeout 335.43 sec ischedInit: The thread number has been automatically set to 256 Test #2703: mpi_dst_example_simple_lap_d_facto2_sched1_kway_rqrcpbegin ..............***Timeout 335.42 sec Test #2709: mpi_dst_example_simple_lap_d_facto2_sched1_kway_tqrcpbegin ..............***Timeout 335.40 sec ischedInit: The thread number has been automatically set to 256 Test #2710: mpi_dst_example_simple_lap_d_facto2_sched1_kway_tqrcpend ................***Timeout 335.40 sec Test #2712: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_tqrcpend .....***Timeout 335.39 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2714: 
mpi_dst_example_simple_lap_d_facto2_sched1_not_rqrrtend .................***Timeout 335.38 sec Start 2714: mpi_dst_example_simple_lap_d_facto2_sched1_not_rqrrtend Test #2716: mpi_dst_example_simple_lap_d_facto2_sched1_kway_rqrrtend ................***Timeout 335.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2716: mpi_dst_example_simple_lap_d_facto2_sched1_kway_rqrrtend Test #2717: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_rqrrtbegin ...***Timeout 335.38 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 
160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2717: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_rqrrtbegin Test #2718: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_rqrrtend .....***Timeout 335.38 sec Start 2718: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_rqrrtend Test #2719: mpi_dst_example_simple_lap_d_facto2_sched1_kway_pqrcpilu0 ...............***Timeout 335.38 sec Start 2719: mpi_dst_example_simple_lap_d_facto2_sched1_kway_pqrcpilu0 Test #2720: mpi_dst_example_simple_lap_d_facto2_sched1_kway_pqrcpilu1 ...............***Timeout 335.39 sec Start 2720: mpi_dst_example_simple_lap_d_facto2_sched1_kway_pqrcpilu1 Test #2721: mpi_dst_example_simple_lap_c_facto0_sched1_not_svdbegin .................***Timeout 335.39 sec Start 2721: mpi_dst_example_simple_lap_c_facto0_sched1_not_svdbegin Test #2722: mpi_dst_example_simple_lap_c_facto0_sched1_not_svdend ...................***Timeout 335.39 sec Start 2722: mpi_dst_example_simple_lap_c_facto0_sched1_not_svdend Test #2725: mpi_dst_example_simple_lap_c_facto0_sched1_kwayprojections_svdbegin .....***Timeout 335.37 sec Start 2725: mpi_dst_example_simple_lap_c_facto0_sched1_kwayprojections_svdbegin Test #2726: mpi_dst_example_simple_lap_c_facto0_sched1_kwayprojections_svdend .......***Timeout 335.37 sec Start 2726: mpi_dst_example_simple_lap_c_facto0_sched1_kwayprojections_svdend Test #2727: mpi_dst_example_simple_lap_c_facto0_sched1_not_pqrcpbegin ...............***Timeout 335.37 sec Start 2727: 
mpi_dst_example_simple_lap_c_facto0_sched1_not_pqrcpbegin Test #2728: mpi_dst_example_simple_lap_c_facto0_sched1_not_pqrcpend .................***Timeout 335.37 sec ischedInit: The thread number has been automatically set to 256 Start 2728: mpi_dst_example_simple_lap_c_facto0_sched1_not_pqrcpend Test #2729: mpi_dst_example_simple_lap_c_facto0_sched1_kway_pqrcpbegin ..............***Timeout 335.37 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2729: mpi_dst_example_simple_lap_c_facto0_sched1_kway_pqrcpbegin Test #2730: mpi_dst_example_simple_lap_c_facto0_sched1_kway_pqrcpend ................***Timeout 335.37 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2730: mpi_dst_example_simple_lap_c_facto0_sched1_kway_pqrcpend Test #2731: mpi_dst_example_simple_lap_c_facto0_sched1_kwayprojections_pqrcpbegin ...***Timeout 335.37 sec Start 2731: 
mpi_dst_example_simple_lap_c_facto0_sched1_kwayprojections_pqrcpbegin Test #2732: mpi_dst_example_simple_lap_c_facto0_sched1_kwayprojections_pqrcpend .....***Timeout 335.37 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 2732: mpi_dst_example_simple_lap_c_facto0_sched1_kwayprojections_pqrcpend Test #2733: mpi_dst_example_simple_lap_c_facto0_sched1_not_rqrcpbegin ...............***Timeout 335.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 
Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2733: mpi_dst_example_simple_lap_c_facto0_sched1_not_rqrcpbegin Test #2734: mpi_dst_example_simple_lap_c_facto0_sched1_not_rqrcpend .................***Timeout 335.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2734: mpi_dst_example_simple_lap_c_facto0_sched1_not_rqrcpend Test #2735: mpi_dst_example_simple_lap_c_facto0_sched1_kway_rqrcpbegin ..............***Timeout 335.38 sec Start 2735: mpi_dst_example_simple_lap_c_facto0_sched1_kway_rqrcpbegin Test #2736: mpi_dst_example_simple_lap_c_facto0_sched1_kway_rqrcpend ................***Timeout 335.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: 
Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 2736: mpi_dst_example_simple_lap_c_facto0_sched1_kway_rqrcpend Test #2737: mpi_dst_example_simple_lap_c_facto0_sched1_kwayprojections_rqrcpbegin ...***Timeout 335.38 sec Start 2737: mpi_dst_example_simple_lap_c_facto0_sched1_kwayprojections_rqrcpbegin Test #2739: mpi_dst_example_simple_lap_c_facto0_sched1_not_tqrcpbegin ...............***Timeout 335.37 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2739: mpi_dst_example_simple_lap_c_facto0_sched1_not_tqrcpbegin Test #2741: mpi_dst_example_simple_lap_c_facto0_sched1_kway_tqrcpbegin ..............***Timeout 335.37 sec Start 2741: 
mpi_dst_example_simple_lap_c_facto0_sched1_kway_tqrcpbegin Test #2742: mpi_dst_example_simple_lap_c_facto0_sched1_kway_tqrcpend ................***Timeout 335.37 sec Start 2742: mpi_dst_example_simple_lap_c_facto0_sched1_kway_tqrcpend Test #2743: mpi_dst_example_simple_lap_c_facto0_sched1_kwayprojections_tqrcpbegin ...***Timeout 335.37 sec Start 2743: mpi_dst_example_simple_lap_c_facto0_sched1_kwayprojections_tqrcpbegin Test #2744: mpi_dst_example_simple_lap_c_facto0_sched1_kwayprojections_tqrcpend .....***Timeout 335.37 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 2744: mpi_dst_example_simple_lap_c_facto0_sched1_kwayprojections_tqrcpend Test #2745: mpi_dst_example_simple_lap_c_facto0_sched1_not_rqrrtbegin ...............***Timeout 335.37 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse 
matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 2745: mpi_dst_example_simple_lap_c_facto0_sched1_not_rqrrtbegin Test #2746: mpi_dst_example_simple_lap_c_facto0_sched1_not_rqrrtend .................***Timeout 335.37 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute 
Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2746: mpi_dst_example_simple_lap_c_facto0_sched1_not_rqrrtend Test #2748: mpi_dst_example_simple_lap_c_facto0_sched1_kway_rqrrtend ................***Timeout 335.36 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2748: mpi_dst_example_simple_lap_c_facto0_sched1_kway_rqrrtend Test #2749: mpi_dst_example_simple_lap_c_facto0_sched1_kwayprojections_rqrrtbegin ...***Timeout 335.36 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 
+-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.674825e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.560121e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.988809e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 7.096434e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.928266e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.924703e-01 s Time to initialize coeftab 3.830038e-01 s Time to factorize 4.435833e+00 s ( 4.57 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Start 2749: mpi_dst_example_simple_lap_c_facto0_sched1_kwayprojections_rqrrtbegin Test #2750: mpi_dst_example_simple_lap_c_facto0_sched1_kwayprojections_rqrrtend .....***Timeout 335.36 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: 
Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 2750: mpi_dst_example_simple_lap_c_facto0_sched1_kwayprojections_rqrrtend Test #2752: mpi_dst_example_simple_lap_c_facto0_sched1_kway_pqrcpilu1 ...............***Timeout 335.39 sec Start 2752: mpi_dst_example_simple_lap_c_facto0_sched1_kway_pqrcpilu1 Test #2753: mpi_dst_example_simple_lap_c_facto1_sched1_not_svdbegin .................***Timeout 335.40 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 
Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.965768e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.222247e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.328954e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 8.083128e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.992155e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.993491e-01 s Time to initialize coeftab 3.146172e-01 s Time to factorize 1.123010e+01 s ( 1.90 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.3 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 
6.905291e-01 s Start 2753: mpi_dst_example_simple_lap_c_facto1_sched1_not_svdbegin Test #2754: mpi_dst_example_simple_lap_c_facto1_sched1_not_svdend ...................***Timeout 335.40 sec Start 2754: mpi_dst_example_simple_lap_c_facto1_sched1_not_svdend Test #2755: mpi_dst_example_simple_lap_c_facto1_sched1_kway_svdbegin ................***Timeout 335.40 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2755: mpi_dst_example_simple_lap_c_facto1_sched1_kway_svdbegin Test #2756: mpi_dst_example_simple_lap_c_facto1_sched1_kway_svdend ..................***Timeout 335.40 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 2756: mpi_dst_example_simple_lap_c_facto1_sched1_kway_svdend Test #2757: mpi_dst_example_simple_lap_c_facto1_sched1_kwayprojections_svdbegin .....***Timeout 335.40 sec ischedInit: The thread number has been 
automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2757: mpi_dst_example_simple_lap_c_facto1_sched1_kwayprojections_svdbegin Test #2758: mpi_dst_example_simple_lap_c_facto1_sched1_kwayprojections_svdend .......***Timeout 335.40 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 
1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2758: mpi_dst_example_simple_lap_c_facto1_sched1_kwayprojections_svdend Test #2759: mpi_dst_example_simple_lap_c_facto1_sched1_not_pqrcpbegin ...............***Timeout 335.40 sec ischedInit: The thread number has been automatically set to 256 Start 2759: mpi_dst_example_simple_lap_c_facto1_sched1_not_pqrcpbegin Test #2760: mpi_dst_example_simple_lap_c_facto1_sched1_not_pqrcpend .................***Timeout 335.40 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian 
Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2760: mpi_dst_example_simple_lap_c_facto1_sched1_not_pqrcpend Test #2761: mpi_dst_example_simple_lap_c_facto1_sched1_kway_pqrcpbegin ..............***Timeout 335.43 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 2761: mpi_dst_example_simple_lap_c_facto1_sched1_kway_pqrcpbegin Test #2762: mpi_dst_example_simple_lap_c_facto1_sched1_kway_pqrcpend ................***Timeout 335.43 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled 
PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 2762: mpi_dst_example_simple_lap_c_facto1_sched1_kway_pqrcpend Test #2763: mpi_dst_example_simple_lap_c_facto1_sched1_kwayprojections_pqrcpbegin ...***Timeout 335.43 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 
256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2763: mpi_dst_example_simple_lap_c_facto1_sched1_kwayprojections_pqrcpbegin Test #2764: mpi_dst_example_simple_lap_c_facto1_sched1_kwayprojections_pqrcpend .....***Timeout 335.43 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2764: mpi_dst_example_simple_lap_c_facto1_sched1_kwayprojections_pqrcpend Test #2765: mpi_dst_example_simple_lap_c_facto1_sched1_not_rqrcpbegin ...............***Timeout 335.43 sec Start 2765: mpi_dst_example_simple_lap_c_facto1_sched1_not_rqrcpbegin Test #2766: 
mpi_dst_example_simple_lap_c_facto1_sched1_not_rqrcpend .................***Timeout 335.43 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2766: mpi_dst_example_simple_lap_c_facto1_sched1_not_rqrcpend Test #2768: mpi_dst_example_simple_lap_c_facto1_sched1_kway_rqrcpend ................***Timeout 335.42 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2768: mpi_dst_example_simple_lap_c_facto1_sched1_kway_rqrcpend Test #2769: mpi_dst_example_simple_lap_c_facto1_sched1_kwayprojections_rqrcpbegin ...***Timeout 335.43 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 2769: mpi_dst_example_simple_lap_c_facto1_sched1_kwayprojections_rqrcpbegin Test #2771: 
mpi_dst_example_simple_lap_c_facto1_sched1_not_tqrcpbegin ...............***Timeout 335.42 sec Start 2771: mpi_dst_example_simple_lap_c_facto1_sched1_not_tqrcpbegin Test #2772: mpi_dst_example_simple_lap_c_facto1_sched1_not_tqrcpend .................***Timeout 335.42 sec Start 2772: mpi_dst_example_simple_lap_c_facto1_sched1_not_tqrcpend Test #2773: mpi_dst_example_simple_lap_c_facto1_sched1_kway_tqrcpbegin ..............***Timeout 335.42 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2773: mpi_dst_example_simple_lap_c_facto1_sched1_kway_tqrcpbegin Test #2774: mpi_dst_example_simple_lap_c_facto1_sched1_kway_tqrcpend ................***Timeout 335.42 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2774: mpi_dst_example_simple_lap_c_facto1_sched1_kway_tqrcpend Test #2776: mpi_dst_example_simple_lap_c_facto1_sched1_kwayprojections_tqrcpend .....***Timeout 335.42 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal 
height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2776: mpi_dst_example_simple_lap_c_facto1_sched1_kwayprojections_tqrcpend Test #2777: mpi_dst_example_simple_lap_c_facto1_sched1_not_rqrrtbegin ...............***Timeout 335.42 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 2777: mpi_dst_example_simple_lap_c_facto1_sched1_not_rqrrtbegin Test #2778: mpi_dst_example_simple_lap_c_facto1_sched1_not_rqrrtend .................***Timeout 335.42 sec Start 2778: 
mpi_dst_example_simple_lap_c_facto1_sched1_not_rqrrtend Test #2779: mpi_dst_example_simple_lap_c_facto1_sched1_kway_rqrrtbegin ..............***Timeout 335.42 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2779: mpi_dst_example_simple_lap_c_facto1_sched1_kway_rqrrtbegin Test #2780: mpi_dst_example_simple_lap_c_facto1_sched1_kway_rqrrtend ................***Timeout 335.42 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 
Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2780: mpi_dst_example_simple_lap_c_facto1_sched1_kway_rqrrtend Test #2781: mpi_dst_example_simple_lap_c_facto1_sched1_kwayprojections_rqrrtbegin ...***Timeout 335.42 sec Start 2781: mpi_dst_example_simple_lap_c_facto1_sched1_kwayprojections_rqrrtbegin Test #2784: mpi_dst_example_simple_lap_c_facto1_sched1_kway_pqrcpilu1 ...............***Timeout 335.42 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models 
CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.976259e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.906483e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.470868e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.368678e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.968986e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.260556e-01 s Time to initialize coeftab 1.642998e+00 s Time to factorize 2.442493e+00 s ( 8.72 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 
88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Start 2784: mpi_dst_example_simple_lap_c_facto1_sched1_kway_pqrcpilu1 Test #2785: mpi_dst_example_simple_lap_c_facto2_sched1_not_svdbegin .................***Timeout 335.42 sec ischedInit: The thread number has been automatically set to 256 Start 2785: mpi_dst_example_simple_lap_c_facto2_sched1_not_svdbegin Test #2786: mpi_dst_example_simple_lap_c_facto2_sched1_not_svdend ...................***Timeout 335.43 sec Start 2786: mpi_dst_example_simple_lap_c_facto2_sched1_not_svdend Test #2787: mpi_dst_example_simple_lap_c_facto2_sched1_kway_svdbegin ................***Timeout 335.43 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2787: mpi_dst_example_simple_lap_c_facto2_sched1_kway_svdbegin Test #2789: mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_svdbegin .....***Timeout 335.43 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2789: mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_svdbegin Test #2791: mpi_dst_example_simple_lap_c_facto2_sched1_not_pqrcpbegin ...............***Timeout 335.43 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking 
size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 2791: mpi_dst_example_simple_lap_c_facto2_sched1_not_pqrcpbegin Test #2792: mpi_dst_example_simple_lap_c_facto2_sched1_not_pqrcpend .................***Timeout 335.46 sec Start 2792: mpi_dst_example_simple_lap_c_facto2_sched1_not_pqrcpend Test #2793: mpi_dst_example_simple_lap_c_facto2_sched1_kway_pqrcpbegin ..............***Timeout 335.47 sec ischedInit: The thread number has been automatically set to 256 Start 2793: mpi_dst_example_simple_lap_c_facto2_sched1_kway_pqrcpbegin Test #2794: mpi_dst_example_simple_lap_c_facto2_sched1_kway_pqrcpend ................***Timeout 335.47 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method 
CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2794: mpi_dst_example_simple_lap_c_facto2_sched1_kway_pqrcpend Test #2795: mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_pqrcpbegin ...***Timeout 335.48 sec Start 2795: mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_pqrcpbegin Test #2796: mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_pqrcpend .....***Timeout 335.53 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2796: mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_pqrcpend Test #2797: mpi_dst_example_simple_lap_c_facto2_sched1_not_rqrcpbegin ...............***Timeout 335.53 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2797: mpi_dst_example_simple_lap_c_facto2_sched1_not_rqrcpbegin Test #2798: mpi_dst_example_simple_lap_c_facto2_sched1_not_rqrcpend .................***Timeout 335.53 sec Start 2798: mpi_dst_example_simple_lap_c_facto2_sched1_not_rqrcpend Test #2799: mpi_dst_example_simple_lap_c_facto2_sched1_kway_rqrcpbegin ..............***Timeout 335.53 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2799: mpi_dst_example_simple_lap_c_facto2_sched1_kway_rqrcpbegin Test #2800: mpi_dst_example_simple_lap_c_facto2_sched1_kway_rqrcpend ................***Timeout 335.54 sec Start 2800: mpi_dst_example_simple_lap_c_facto2_sched1_kway_rqrcpend Test #2801: mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_rqrcpbegin ...***Timeout 335.65 sec Start 2801: mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_rqrcpbegin Test #2802: 
mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_rqrcpend .....***Timeout 335.88 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2802: mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_rqrcpend Test #2803: mpi_dst_example_simple_lap_c_facto2_sched1_not_tqrcpbegin ...............***Timeout 335.95 sec Start 2803: mpi_dst_example_simple_lap_c_facto2_sched1_not_tqrcpbegin Test #2804: mpi_dst_example_simple_lap_c_facto2_sched1_not_tqrcpend .................***Timeout 336.03 sec ischedInit: The thread number has been automatically set to 256 Start 2804: mpi_dst_example_simple_lap_c_facto2_sched1_not_tqrcpend Test #2805: mpi_dst_example_simple_lap_c_facto2_sched1_kway_tqrcpbegin ..............***Timeout 336.03 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2805: mpi_dst_example_simple_lap_c_facto2_sched1_kway_tqrcpbegin Test #2806: mpi_dst_example_simple_lap_c_facto2_sched1_kway_tqrcpend ................***Timeout 336.04 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2806: mpi_dst_example_simple_lap_c_facto2_sched1_kway_tqrcpend Test #2807: mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_tqrcpbegin ...***Timeout 336.05 sec Start 2807: mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_tqrcpbegin Test #2808: mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_tqrcpend .....***Timeout 336.06 sec Start 2808: mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_tqrcpend Test #2810: mpi_dst_example_simple_lap_c_facto2_sched1_not_rqrrtend .................***Timeout 336.07 sec Start 2810: mpi_dst_example_simple_lap_c_facto2_sched1_not_rqrrtend Test #2811: mpi_dst_example_simple_lap_c_facto2_sched1_kway_rqrrtbegin ..............***Timeout 336.12 sec Start 2811: 
mpi_dst_example_simple_lap_c_facto2_sched1_kway_rqrrtbegin Test #2812: mpi_dst_example_simple_lap_c_facto2_sched1_kway_rqrrtend ................***Timeout 336.20 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2812: mpi_dst_example_simple_lap_c_facto2_sched1_kway_rqrrtend Test #2814: mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_rqrrtend .....***Timeout 336.21 sec ischedInit: The thread number has been automatically set to 256 Start 2814: mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_rqrrtend Test #2815: mpi_dst_example_simple_lap_c_facto2_sched1_kway_pqrcpilu0 ...............***Timeout 336.22 sec Start 2815: mpi_dst_example_simple_lap_c_facto2_sched1_kway_pqrcpilu0 Test #2817: mpi_dst_example_simple_lap_c_facto3_sched1_not_svdbegin .................***Timeout 336.22 sec Start 2817: mpi_dst_example_simple_lap_c_facto3_sched1_not_svdbegin Test #2818: 
mpi_dst_example_simple_lap_c_facto3_sched1_not_svdend ...................***Timeout 336.24 sec Start 2818: mpi_dst_example_simple_lap_c_facto3_sched1_not_svdend Test #2819: mpi_dst_example_simple_lap_c_facto3_sched1_kway_svdbegin ................***Timeout 336.25 sec ischedInit: The thread number has been automatically set to 256 Start 2819: mpi_dst_example_simple_lap_c_facto3_sched1_kway_svdbegin Test #2821: mpi_dst_example_simple_lap_c_facto3_sched1_kwayprojections_svdbegin .....***Timeout 336.31 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2821: mpi_dst_example_simple_lap_c_facto3_sched1_kwayprojections_svdbegin Test #2823: mpi_dst_example_simple_lap_c_facto3_sched1_not_pqrcpbegin ...............***Timeout 336.36 sec Start 2823: mpi_dst_example_simple_lap_c_facto3_sched1_not_pqrcpbegin Test #2826: mpi_dst_example_simple_lap_c_facto3_sched1_kway_pqrcpend ................***Timeout 336.36 sec ischedInit: The thread number has been automatically set to 256 Start 2826: mpi_dst_example_simple_lap_c_facto3_sched1_kway_pqrcpend Test #2827: mpi_dst_example_simple_lap_c_facto3_sched1_kwayprojections_pqrcpbegin ...***Timeout 336.36 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress 
method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2827: mpi_dst_example_simple_lap_c_facto3_sched1_kwayprojections_pqrcpbegin Test #2828: mpi_dst_example_simple_lap_c_facto3_sched1_kwayprojections_pqrcpend .....***Timeout 336.36 sec ischedInit: The thread number has been automatically set to 256 Start 2828: mpi_dst_example_simple_lap_c_facto3_sched1_kwayprojections_pqrcpend Test #2829: mpi_dst_example_simple_lap_c_facto3_sched1_not_rqrcpbegin ...............***Timeout 336.36 sec Start 2829: mpi_dst_example_simple_lap_c_facto3_sched1_not_rqrcpbegin Test #2830: mpi_dst_example_simple_lap_c_facto3_sched1_not_rqrcpend .................***Timeout 336.41 sec Start 2830: mpi_dst_example_simple_lap_c_facto3_sched1_not_rqrcpend Test #2831: mpi_dst_example_simple_lap_c_facto3_sched1_kway_rqrcpbegin ..............***Timeout 336.41 sec ischedInit: The thread number has been automatically set to 256 Start 2831: mpi_dst_example_simple_lap_c_facto3_sched1_kway_rqrcpbegin Test #2832: mpi_dst_example_simple_lap_c_facto3_sched1_kway_rqrcpend ................***Timeout 336.41 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel 
MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2832: mpi_dst_example_simple_lap_c_facto3_sched1_kway_rqrcpend Test #2833: mpi_dst_example_simple_lap_c_facto3_sched1_kwayprojections_rqrcpbegin ...***Timeout 336.48 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : 
Ordering method is: Scotch Time to compute ordering 4.595771e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.192831e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.072492e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.203063e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.296395e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 5.245640e-02 s Time to initialize coeftab 1.604850e-01 s Time to factorize 2.815425e+00 s ( 7.20 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Start 2833: mpi_dst_example_simple_lap_c_facto3_sched1_kwayprojections_rqrcpbegin Test #2834: mpi_dst_example_simple_lap_c_facto3_sched1_kwayprojections_rqrcpend .....***Timeout 336.48 sec ischedInit: The thread number has been automatically set to 256 Start 2834: mpi_dst_example_simple_lap_c_facto3_sched1_kwayprojections_rqrcpend Test #2835: mpi_dst_example_simple_lap_c_facto3_sched1_not_tqrcpbegin ...............***Timeout 336.57 sec Start 2835: mpi_dst_example_simple_lap_c_facto3_sched1_not_tqrcpbegin Test #2836: mpi_dst_example_simple_lap_c_facto3_sched1_not_tqrcpend .................***Timeout 336.58 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: 
sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2836: mpi_dst_example_simple_lap_c_facto3_sched1_not_tqrcpend Test #2837: mpi_dst_example_simple_lap_c_facto3_sched1_kway_tqrcpbegin ..............***Timeout 336.60 sec Start 2837: mpi_dst_example_simple_lap_c_facto3_sched1_kway_tqrcpbegin Test #2838: mpi_dst_example_simple_lap_c_facto3_sched1_kway_tqrcpend ................***Timeout 336.63 sec Start 2838: mpi_dst_example_simple_lap_c_facto3_sched1_kway_tqrcpend Test #2839: mpi_dst_example_simple_lap_c_facto3_sched1_kwayprojections_tqrcpbegin ...***Timeout 336.64 sec Start 2839: mpi_dst_example_simple_lap_c_facto3_sched1_kwayprojections_tqrcpbegin Test #2841: mpi_dst_example_simple_lap_c_facto3_sched1_not_rqrrtbegin ...............***Timeout 336.68 sec Start 2841: mpi_dst_example_simple_lap_c_facto3_sched1_not_rqrrtbegin Test #2842: mpi_dst_example_simple_lap_c_facto3_sched1_not_rqrrtend .................***Timeout 336.72 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2842: mpi_dst_example_simple_lap_c_facto3_sched1_not_rqrrtend Test #2843: mpi_dst_example_simple_lap_c_facto3_sched1_kway_rqrrtbegin 
..............***Timeout 336.76 sec Start 2843: mpi_dst_example_simple_lap_c_facto3_sched1_kway_rqrrtbegin Test #2844: mpi_dst_example_simple_lap_c_facto3_sched1_kway_rqrrtend ................***Timeout 336.80 sec Start 2844: mpi_dst_example_simple_lap_c_facto3_sched1_kway_rqrrtend Test #2846: mpi_dst_example_simple_lap_c_facto3_sched1_kwayprojections_rqrrtend .....***Timeout 336.81 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2846: mpi_dst_example_simple_lap_c_facto3_sched1_kwayprojections_rqrrtend Test #2847: mpi_dst_example_simple_lap_c_facto3_sched1_kway_pqrcpilu0 ...............***Timeout 336.87 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 
300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2847: mpi_dst_example_simple_lap_c_facto3_sched1_kway_pqrcpilu0 Test #2848: mpi_dst_example_simple_lap_c_facto3_sched1_kway_pqrcpilu1 ...............***Timeout 336.94 sec Start 2848: mpi_dst_example_simple_lap_c_facto3_sched1_kway_pqrcpilu1 Test #2849: mpi_dst_example_simple_lap_c_facto4_sched1_not_svdbegin .................***Timeout 337.01 sec Start 2849: mpi_dst_example_simple_lap_c_facto4_sched1_not_svdbegin Test #2850: mpi_dst_example_simple_lap_c_facto4_sched1_not_svdend ...................***Timeout 337.10 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2850: mpi_dst_example_simple_lap_c_facto4_sched1_not_svdend Test #2851: mpi_dst_example_simple_lap_c_facto4_sched1_kway_svdbegin ................***Timeout 337.21 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels 
of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2851: mpi_dst_example_simple_lap_c_facto4_sched1_kway_svdbegin Test #2853: mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_svdbegin .....***Timeout 337.34 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2853: mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_svdbegin Test #2854: mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_svdend .......***Timeout 337.47 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of 
GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2854: mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_svdend Test #2855: mpi_dst_example_simple_lap_c_facto4_sched1_not_pqrcpbegin ...............***Timeout 337.53 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2855: mpi_dst_example_simple_lap_c_facto4_sched1_not_pqrcpbegin Test #2856: mpi_dst_example_simple_lap_c_facto4_sched1_not_pqrcpend .................***Timeout 337.54 sec Start 2856: mpi_dst_example_simple_lap_c_facto4_sched1_not_pqrcpend Test #2858: mpi_dst_example_simple_lap_c_facto4_sched1_kway_pqrcpend ................***Timeout 337.56 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 
GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2858: mpi_dst_example_simple_lap_c_facto4_sched1_kway_pqrcpend Test #2860: mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_pqrcpend .....***Timeout 337.60 sec ischedInit: The thread number has been automatically set to 256 Start 2860: mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_pqrcpend Test #2862: mpi_dst_example_simple_lap_c_facto4_sched1_not_rqrcpend .................***Timeout 337.71 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 
256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2862: mpi_dst_example_simple_lap_c_facto4_sched1_not_rqrcpend Test #2863: mpi_dst_example_simple_lap_c_facto4_sched1_kway_rqrcpbegin ..............***Timeout 337.73 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2863: mpi_dst_example_simple_lap_c_facto4_sched1_kway_rqrcpbegin Test #2864: mpi_dst_example_simple_lap_c_facto4_sched1_kway_rqrcpend ................***Timeout 337.75 sec Start 2864: mpi_dst_example_simple_lap_c_facto4_sched1_kway_rqrcpend Test #2865: mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_rqrcpbegin ...***Timeout 337.77 sec Start 2865: mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_rqrcpbegin Test #2866: mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_rqrcpend .....***Timeout 337.80 sec Start 2866: mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_rqrcpend Test #2867: mpi_dst_example_simple_lap_c_facto4_sched1_not_tqrcpbegin ...............***Timeout 337.81 sec ischedInit: The thread number has been automatically set to 256 Start 2867: mpi_dst_example_simple_lap_c_facto4_sched1_not_tqrcpbegin Test #2868: mpi_dst_example_simple_lap_c_facto4_sched1_not_tqrcpend .................***Timeout 337.95 sec Start 2868: mpi_dst_example_simple_lap_c_facto4_sched1_not_tqrcpend Test #2869: mpi_dst_example_simple_lap_c_facto4_sched1_kway_tqrcpbegin ..............***Timeout 338.07 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set 
to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2869: mpi_dst_example_simple_lap_c_facto4_sched1_kway_tqrcpbegin Test #2870: mpi_dst_example_simple_lap_c_facto4_sched1_kway_tqrcpend ................***Timeout 338.20 sec Start 2870: mpi_dst_example_simple_lap_c_facto4_sched1_kway_tqrcpend Test #2871: mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_tqrcpbegin ...***Timeout 338.32 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank 
parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 2871: mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_tqrcpbegin Test #2872: mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_tqrcpend .....***Timeout 338.36 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2872: mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_tqrcpend Test #2873: mpi_dst_example_simple_lap_c_facto4_sched1_not_rqrrtbegin ...............***Timeout 338.40 sec Start 2873: mpi_dst_example_simple_lap_c_facto4_sched1_not_rqrrtbegin Test #2874: mpi_dst_example_simple_lap_c_facto4_sched1_not_rqrrtend .................***Timeout 338.43 sec ischedInit: The thread number has been automatically set to 256 Start 2874: mpi_dst_example_simple_lap_c_facto4_sched1_not_rqrrtend Test #2875: mpi_dst_example_simple_lap_c_facto4_sched1_kway_rqrrtbegin ..............***Timeout 338.47 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2875: mpi_dst_example_simple_lap_c_facto4_sched1_kway_rqrrtbegin Test #2877: mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_rqrrtbegin ...***Timeout 338.53 sec Start 2877: mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_rqrrtbegin Test #2878: mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_rqrrtend .....***Timeout 338.61 sec ischedInit: The thread number has been automatically set to 256 Start 2878: mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_rqrrtend 
Test #2879: mpi_dst_example_simple_lap_c_facto4_sched1_kway_pqrcpilu0 ...............***Timeout 338.70 sec ischedInit: The thread number has been automatically set to 256 Start 2879: mpi_dst_example_simple_lap_c_facto4_sched1_kway_pqrcpilu0 Test #2880: mpi_dst_example_simple_lap_c_facto4_sched1_kway_pqrcpilu1 ...............***Timeout 338.80 sec Start 2880: mpi_dst_example_simple_lap_c_facto4_sched1_kway_pqrcpilu1 Test #2881: mpi_dst_example_simple_lap_z_facto0_sched1_not_svdbegin .................***Timeout 338.89 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 2881: mpi_dst_example_simple_lap_z_facto0_sched1_not_svdbegin Test #2882: mpi_dst_example_simple_lap_z_facto0_sched1_not_svdend ...................***Timeout 338.94 sec Start 2882: mpi_dst_example_simple_lap_z_facto0_sched1_not_svdend Test #2883: mpi_dst_example_simple_lap_z_facto0_sched1_kway_svdbegin ................***Timeout 339.05 sec Start 2883: 
mpi_dst_example_simple_lap_z_facto0_sched1_kway_svdbegin Test #2884: mpi_dst_example_simple_lap_z_facto0_sched1_kway_svdend ..................***Timeout 339.12 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2884: mpi_dst_example_simple_lap_z_facto0_sched1_kway_svdend Test #2885: mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_svdbegin .....***Timeout 339.16 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2885: mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_svdbegin Test #2886: mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_svdend .......***Timeout 339.23 sec Start 2886: mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_svdend Test #2887: mpi_dst_example_simple_lap_z_facto0_sched1_not_pqrcpbegin ...............***Timeout 339.28 sec ischedInit: The thread number has been 
automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2887: mpi_dst_example_simple_lap_z_facto0_sched1_not_pqrcpbegin Test #2888: mpi_dst_example_simple_lap_z_facto0_sched1_not_pqrcpend .................***Timeout 339.39 sec Start 2888: mpi_dst_example_simple_lap_z_facto0_sched1_not_pqrcpend Test #2889: mpi_dst_example_simple_lap_z_facto0_sched1_kway_pqrcpbegin ..............***Timeout 339.47 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2889: mpi_dst_example_simple_lap_z_facto0_sched1_kway_pqrcpbegin Test #2890: mpi_dst_example_simple_lap_z_facto0_sched1_kway_pqrcpend ................***Timeout 339.50 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX 
: Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 2890: mpi_dst_example_simple_lap_z_facto0_sched1_kway_pqrcpend Test #2891: mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_pqrcpbegin ...***Timeout 339.60 sec Start 2891: mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_pqrcpbegin Test #2892: mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_pqrcpend .....***Timeout 339.60 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 
6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.981538e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.098456e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.307271e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.807920e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.280898e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.862790e-01 s Time to initialize coeftab 8.237952e-02 s Time to factorize 9.022278e-01 s (22.48 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside 
not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Start 2892: mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_pqrcpend Test #2893: mpi_dst_example_simple_lap_z_facto0_sched1_not_rqrcpbegin ...............***Timeout 339.61 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2893: mpi_dst_example_simple_lap_z_facto0_sched1_not_rqrcpbegin Test #2894: mpi_dst_example_simple_lap_z_facto0_sched1_not_rqrcpend .................***Timeout 339.63 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 
+-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2894: mpi_dst_example_simple_lap_z_facto0_sched1_not_rqrcpend Test #2895: mpi_dst_example_simple_lap_z_facto0_sched1_kway_rqrcpbegin ..............***Timeout 339.65 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 2895: mpi_dst_example_simple_lap_z_facto0_sched1_kway_rqrcpbegin Test #2896: mpi_dst_example_simple_lap_z_facto0_sched1_kway_rqrcpend ................***Timeout 339.68 sec ischedInit: The thread number has been automatically set to 256 Start 2896: mpi_dst_example_simple_lap_z_facto0_sched1_kway_rqrcpend Test #2898: mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_rqrcpend .....***Timeout 339.70 sec ischedInit: The thread number has been automatically set to 256 Start 2898: mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_rqrcpend Test #2899: 
mpi_dst_example_simple_lap_z_facto0_sched1_not_tqrcpbegin ...............***Timeout 339.74 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2899: mpi_dst_example_simple_lap_z_facto0_sched1_not_tqrcpbegin Test #2900: mpi_dst_example_simple_lap_z_facto0_sched1_not_tqrcpend .................***Timeout 339.80 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple 
Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2900: mpi_dst_example_simple_lap_z_facto0_sched1_not_tqrcpend Test #2901: mpi_dst_example_simple_lap_z_facto0_sched1_kway_tqrcpbegin ..............***Timeout 339.85 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2901: mpi_dst_example_simple_lap_z_facto0_sched1_kway_tqrcpbegin Test #2902: mpi_dst_example_simple_lap_z_facto0_sched1_kway_tqrcpend ................***Timeout 339.90 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 
Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 2902: mpi_dst_example_simple_lap_z_facto0_sched1_kway_tqrcpend Test #2903: mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_tqrcpbegin ...***Timeout 339.95 sec Start 2903: mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_tqrcpbegin Test #2615: mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_tqrcpbegin ...***Timeout 340.00 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2617: mpi_dst_example_simple_lap_s_facto2_sched1_not_rqrrtbegin ...............***Timeout 340.00 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2619: mpi_dst_example_simple_lap_s_facto2_sched1_kway_rqrrtbegin ..............***Timeout 339.99 sec ischedInit: The thread number has been 
automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2625: mpi_dst_example_simple_lap_d_facto0_sched1_not_svdbegin .................***Timeout 339.97 sec Test #2626: mpi_dst_example_simple_lap_d_facto0_sched1_not_svdend ...................***Timeout 340.01 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2627: mpi_dst_example_simple_lap_d_facto0_sched1_kway_svdbegin ................***Timeout 339.99 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: 
PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Test #2629: mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_svdbegin .....***Timeout 339.97 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Test #2630: mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_svdend .......***Timeout 339.96 sec Test #2639: 
mpi_dst_example_simple_lap_d_facto0_sched1_kway_rqrcpbegin ..............***Timeout 339.99 sec Test #2648: mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_tqrcpend .....***Timeout 339.97 sec Test #2651: mpi_dst_example_simple_lap_d_facto0_sched1_kway_rqrrtbegin ..............***Timeout 339.96 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2652: mpi_dst_example_simple_lap_d_facto0_sched1_kway_rqrrtend ................***Timeout 339.95 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Test #2653: mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_rqrrtbegin ...***Timeout 339.93 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2656: mpi_dst_example_simple_lap_d_facto0_sched1_kway_pqrcpilu1 ...............***Timeout 339.92 sec ischedInit: The thread number has been automatically set to 256 Test #2657: mpi_dst_example_simple_lap_d_facto1_sched1_not_svdbegin .................***Timeout 339.95 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: 
Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.133717e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.363854e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.121847e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 8.606161e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.933776e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.734858e-01 s Time to initialize coeftab 1.171662e-01 s Time to factorize 3.690517e+00 s ( 1.42 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.3 Ko / 88.6 Ko 
------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 7.605432e-01 s Test #2661: mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_svdbegin .....***Timeout 339.94 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2664: mpi_dst_example_simple_lap_d_facto1_sched1_not_pqrcpend .................***Timeout 339.93 sec Test #2665: mpi_dst_example_simple_lap_d_facto1_sched1_kway_pqrcpbegin ..............***Timeout 339.93 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication 
support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2666: mpi_dst_example_simple_lap_d_facto1_sched1_kway_pqrcpend ................***Timeout 339.95 sec Test #2668: mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_pqrcpend .....***Timeout 339.94 sec Test #2672: mpi_dst_example_simple_lap_d_facto1_sched1_kway_rqrcpend ................***Timeout 339.94 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress 
method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2673: mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_rqrcpbegin ...***Timeout 339.93 sec ischedInit: The thread number has been automatically set to 256 Test #2675: mpi_dst_example_simple_lap_d_facto1_sched1_not_tqrcpbegin ...............***Timeout 339.92 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 
300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.784149e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.158388e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.486326e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 9.099739e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.945449e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.350530e-01 s Time to initialize coeftab 2.164758e-01 s Time to factorize 3.556944e+00 s ( 1.47 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88 Ko / 88.6 Ko ------------------------------------------------ Total 136 Ko / 137 Ko Time to solve 1.577189e+00 s Test #2676: mpi_dst_example_simple_lap_d_facto1_sched1_not_tqrcpend .................***Timeout 339.95 sec Test #2677: mpi_dst_example_simple_lap_d_facto1_sched1_kway_tqrcpbegin ..............***Timeout 339.94 sec ischedInit: The thread number has been automatically set to 256 Test #2679: mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_tqrcpbegin ...***Timeout 339.93 sec Test #2680: 
mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_tqrcpend .....***Timeout 339.93 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Test #2681: mpi_dst_example_simple_lap_d_facto1_sched1_not_rqrrtbegin ...............***Timeout 339.92 sec Test #2682: mpi_dst_example_simple_lap_d_facto1_sched1_not_rqrrtend .................***Timeout 339.91 sec ischedInit: The thread number has been automatically set to 256 Test #2690: mpi_dst_example_simple_lap_d_facto2_sched1_not_svdend ...................***Timeout 339.93 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of 
threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Test #2692: mpi_dst_example_simple_lap_d_facto2_sched1_kway_svdend ..................***Timeout 339.92 sec Test #2693: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_svdbegin .....***Timeout 339.92 sec Test #2698: mpi_dst_example_simple_lap_d_facto2_sched1_kway_pqrcpend ................***Timeout 339.89 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2699: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_pqrcpbegin ...***Timeout 339.88 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank 
parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.118304e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.945343e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.488099e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.200995e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.193258e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 5.299384e-01 s Time to initialize coeftab 1.779584e-01 s Time to factorize 1.178750e+00 s ( 8.47 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 
177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 6.222163e-01 s Test #2701: mpi_dst_example_simple_lap_d_facto2_sched1_not_rqrcpbegin ...............***Timeout 339.90 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.121492e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.096704e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 
Stoping criterion -1 Time for reordering 7.108396e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 5.906448e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.339256e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.510866e-01 s Time to initialize coeftab 1.986465e-01 s Time to factorize 1.506871e+00 s ( 6.63 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 225 Ko / 226 Ko Time to solve 9.466562e-01 s Test #2702: mpi_dst_example_simple_lap_d_facto2_sched1_not_rqrcpend .................***Timeout 339.89 sec ischedInit: The thread number has been automatically set to 256 Test #2704: mpi_dst_example_simple_lap_d_facto2_sched1_kway_rqrcpend ................***Timeout 339.86 sec ischedInit: The thread number has been automatically set to 256 Test #2705: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_rqrcpbegin ...***Timeout 339.85 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple 
Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2706: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_rqrcpend .....***Timeout 339.87 sec Test #2707: mpi_dst_example_simple_lap_d_facto2_sched1_not_tqrcpbegin ...............***Timeout 339.86 sec ischedInit: The thread number has been automatically set to 256 Test #2711: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_tqrcpbegin ...***Timeout 339.84 sec Test #2713: mpi_dst_example_simple_lap_d_facto2_sched1_not_rqrrtbegin ...............***Timeout 339.84 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2906: mpi_dst_example_simple_lap_z_facto0_sched1_not_rqrrtend .................***Timeout 353.08 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 
Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.849657e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.731534e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.512804e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.568834e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.695754e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 7.000743e-01 s Time to initialize coeftab 1.510031e-01 s Time to factorize 9.518000e+00 s ( 2.13 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes 
Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Start 2906: mpi_dst_example_simple_lap_z_facto0_sched1_not_rqrrtend Test #2908: mpi_dst_example_simple_lap_z_facto0_sched1_kway_rqrrtend ................***Timeout 354.79 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.378039e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time 
to compute symbol matrix 3.889557e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.468462e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.580712e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.872804e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.780914e-01 s Time to initialize coeftab 9.692891e-02 s Time to factorize 7.228531e+00 s ( 2.81 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Start 2908: mpi_dst_example_simple_lap_z_facto0_sched1_kway_rqrrtend Test #2909: mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_rqrrtbegin ...***Timeout 354.80 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: 
PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.553494e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.018926e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.866700e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 7.176246e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.537781e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.314383e-02 s Time to initialize coeftab 3.019192e-01 s Time to factorize 9.109522e+00 s ( 2.23 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 
0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 2.502126e+00 s - iteration 1 : total iteration time 2.24 s error 4.5999e-13 Time for refinement 3.959236e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.599882e-13 max(|| b_i - A x_i ||_1) 8.894288e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.244331e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.599882e-13 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.599882e-13 max(|| b_i - A x_i ||_1) 8.894288e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.244331e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.894288e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.244331e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.599882e-13 max(|| b_i - A x_i ||_1) 8.894288e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.244331e+00 (SUCCESS) Start 2909: mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_rqrrtbegin Test #2911: mpi_dst_example_simple_lap_z_facto0_sched1_kway_pqrcpilu0 ...............***Timeout 355.89 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: 
Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.538157e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.187560e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.138388e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.297522e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.591968e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.945175e-02 s Time to initialize coeftab 8.601855e-02 s Time to factorize 1.795394e+01 s ( 
1.13 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 3.393824e-01 s - iteration 1 : total iteration time 0.513 s error 1.3953e-14 Time for refinement 1.526099e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.395481e-14 max(|| b_i - A x_i ||_1) 2.164653e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.462153e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.395481e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.395481e-14 max(|| b_i - A x_i ||_1) 2.164653e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.462153e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.395481e-14 max(|| b_i - A x_i ||_1) 2.164653e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.462153e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 2.164653e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.462153e-02 (SUCCESS) Start 2911: mpi_dst_example_simple_lap_z_facto0_sched1_kway_pqrcpilu0 2746/3626 Test #3065: mpi_dst_example_simple_lap_s_facto0_sched4_not_rqrrtbegin ...............***Timeout 358.66 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.145951e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.021393e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.884152e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 8.409447e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.253234e+01 s +-------------------------------------------------+ 
Factorization task: Factorization used: LL^h Time to initialize internal csc 1.454242e-01 s Time to initialize coeftab 3.028828e-01 s Time to factorize 2.710102e+00 s ( 1.87 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 2.347675e+00 s - iteration 1 : total iteration time 1.73 s error 6.1748e-11 Time for refinement 3.515543e+00 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.995589e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.995589e-08 max(|| b_i - A x_i ||_1) 2.935670e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.688935e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.995589e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.995589e-08 max(|| b_i - A x_i ||_1) 2.935670e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.688935e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.935670e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.688935e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.935670e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.688935e-01 (SUCCESS) Start 3065: mpi_dst_example_simple_lap_s_facto0_sched4_not_rqrrtbegin 2746/3626 Test #3066: mpi_dst_example_simple_lap_s_facto0_sched4_not_rqrrtend .................***Timeout 358.67 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 
256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.287427e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.741701e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.229525e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 
2.250423e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.566141e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.602440e-01 s Time to initialize coeftab 1.458818e-01 s Time to factorize 2.197080e+00 s ( 2.30 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.377648e+00 s Time for refinement 8.856171e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.257999e-07 max(|| b_i - A x_i ||_1) 9.810029e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.232719e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.257999e-07 max(|| b_i - A x_i ||_1) 9.810029e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.232719e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.257999e-07 max(|| b_i - A x_i ||_1) 9.810029e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.232719e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.257999e-07 max(|| b_i - A x_i ||_1) 9.810029e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.232719e+00 (SUCCESS) Start 3066: mpi_dst_example_simple_lap_s_facto0_sched4_not_rqrrtend 2746/3626 Test #3067: mpi_dst_example_simple_lap_s_facto0_sched4_kway_rqrrtbegin ..............***Timeout 358.69 sec ischedInit: The 
thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.949467e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.249617e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.398611e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 
MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.507334e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.331722e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 8.275033e-01 s Time to initialize coeftab 5.702063e-01 s Time to factorize 1.123108e+01 s (461.56 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko Start 3067: mpi_dst_example_simple_lap_s_facto0_sched4_kway_rqrrtbegin 2746/3626 Test #3068: mpi_dst_example_simple_lap_s_facto0_sched4_kway_rqrrtend ................***Timeout 358.72 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections 
distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.101134e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.947265e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.109190e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.236955e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.256267e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.073570e-01 s Time to initialize coeftab 2.982125e-01 s Time to factorize 7.996200e-01 s ( 6.33 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 8.647965e-01 s Time for refinement 1.832088e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 
2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.484083e-07 max(|| b_i - A x_i ||_1) 1.239021e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.556942e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.484083e-07 max(|| b_i - A x_i ||_1) 1.239021e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.556942e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.484083e-07 max(|| b_i - A x_i ||_1) 1.239021e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.556942e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.484083e-07 max(|| b_i - A x_i ||_1) 1.239021e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.556942e+00 (SUCCESS) Start 3068: mpi_dst_example_simple_lap_s_facto0_sched4_kway_rqrrtend 2746/3626 Test #3069: mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_rqrrtbegin ...***Timeout 358.73 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min 
ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.255400e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.385116e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.157920e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 8.932954e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.362942e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.034144e-01 s Time to initialize coeftab 1.628250e+00 s Time to factorize 3.615578e+00 s ( 1.40 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.515745e+00 s - iteration 1 : total iteration time 1.14 s error 6.0269e-11 Time for refinement 
2.495365e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.861916e-08 max(|| b_i - A x_i ||_1) 2.935776e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.689068e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.861916e-08 max(|| b_i - A x_i ||_1) 2.935776e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.689068e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.861916e-08 max(|| b_i - A x_i ||_1) 2.935776e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.689068e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.861916e-08 max(|| b_i - A x_i ||_1) 2.935776e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.689068e-01 (SUCCESS) Start 3069: mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_rqrrtbegin 2746/3626 Test #3070: mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_rqrrtend .....***Timeout 358.74 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 
- Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.047178e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.581588e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.211375e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.230396e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.090314e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.074557e-01 s Time to initialize coeftab 9.401132e-02 s Time to factorize 1.029064e+00 s ( 4.92 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko 
------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 8.546830e-01 s - iteration 1 : total iteration time 1.17 s error 2.1822e-11 Time for refinement 2.237506e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.780110e-08 max(|| b_i - A x_i ||_1) 2.838953e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.567400e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.780110e-08 max(|| b_i - A x_i ||_1) 2.838953e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.567400e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.780110e-08 max(|| b_i - A x_i ||_1) 2.838953e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.567400e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.780110e-08 max(|| b_i - A x_i ||_1) 2.838953e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.567400e-01 (SUCCESS) Start 3070: mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_rqrrtend 2746/3626 Test #3071: mpi_dst_example_simple_lap_s_facto0_sched4_kway_pqrcpilu0 ...............***Timeout 358.74 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 
160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.608988e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.555051e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.568260e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.231750e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.898160e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.557194e-01 s Time to initialize coeftab 1.129727e-01 s Time to factorize 2.306151e+00 s ( 2.20 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ 
Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 8.161345e-01 s - iteration 1 : total iteration time 0.801 s error 3.0916e-11 Time for refinement 1.723557e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.881448e-08 max(|| b_i - A x_i ||_1) 2.885004e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.625267e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.881448e-08 max(|| b_i - A x_i ||_1) 2.885004e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.625267e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.881448e-08 max(|| b_i - A x_i ||_1) 2.885004e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.625267e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.881448e-08 max(|| b_i - A x_i ||_1) 2.885004e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.625267e-01 (SUCCESS) Start 3071: mpi_dst_example_simple_lap_s_facto0_sched4_kway_pqrcpilu0 2746/3626 Test #3072: mpi_dst_example_simple_lap_s_facto0_sched4_kway_pqrcpilu1 ...............***Timeout 358.75 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: 
sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.203926e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.008332e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.179102e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.798988e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.800894e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 8.745131e-01 s Time to initialize coeftab 1.921345e-01 s Time to 
factorize 8.581015e+00 s (604.10 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Start 3072: mpi_dst_example_simple_lap_s_facto0_sched4_kway_pqrcpilu1 2746/3626 Test #3073: mpi_dst_example_simple_lap_s_facto1_sched4_not_svdbegin .................***Timeout 358.75 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.761005e+00 s +-------------------------------------------------+ Symbolic factorization 
subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.861710e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.536166e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.374045e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.370319e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 5.345909e-01 s Time to initialize coeftab 3.797522e-01 s Time to factorize 1.298222e+01 s (412.80 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.747975e+00 s Time for refinement 1.945080e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.944151e-07 max(|| b_i - A x_i ||_1) 8.516301e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.070150e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.944151e-07 max(|| b_i - A x_i ||_1) 8.516301e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * 
||x_i||_oo * eps)) 1.070150e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.944151e-07 max(|| b_i - A x_i ||_1) 8.516301e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.070150e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.944151e-07 max(|| b_i - A x_i ||_1) 8.516301e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.070150e+00 (SUCCESS) Start 3073: mpi_dst_example_simple_lap_s_facto1_sched4_not_svdbegin 2746/3626 Test #3074: mpi_dst_example_simple_lap_s_facto1_sched4_not_svdend ...................***Timeout 358.76 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute 
ordering 9.097987e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.097124e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.020979e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.580074e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.240284e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 9.021455e-01 s Time to initialize coeftab 2.081011e-01 s Time to factorize 2.108516e+00 s ( 2.48 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 8.933827e-01 s Time for refinement 1.387140e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.729345e-07 max(|| b_i - A x_i ||_1) 7.576965e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.521142e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || 
b_i ||_2) 1.729345e-07 max(|| b_i - A x_i ||_1) 7.576965e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.521142e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.729345e-07 max(|| b_i - A x_i ||_1) 7.576965e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.521142e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.729345e-07 max(|| b_i - A x_i ||_1) 7.576965e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.521142e-01 (SUCCESS) Start 3074: mpi_dst_example_simple_lap_s_facto1_sched4_not_svdend 2746/3626 Test #3081: mpi_dst_example_simple_lap_s_facto1_sched4_kway_pqrcpbegin ..............***Timeout 361.83 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 
+-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.062942e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.160319e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.172704e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.770116e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.161251e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.001469e-01 s Time to initialize coeftab 1.346052e-01 s Time to factorize 3.238213e+00 s ( 1.62 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.057117e+00 s - iteration 1 : total iteration time 1.27 s error 4.8627e-11 Time for refinement 2.659485e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i 
||_2 / || b_i ||_2) 5.920955e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.920955e-08 max(|| b_i - A x_i ||_1) 2.880209e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.619243e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.920955e-08 max(|| b_i - A x_i ||_1) 2.880209e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.619243e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.920955e-08 max(|| b_i - A x_i ||_1) 2.880209e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.619243e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.880209e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.619243e-01 (SUCCESS) Start 3081: mpi_dst_example_simple_lap_s_facto1_sched4_kway_pqrcpbegin 2746/3626 Test #3082: mpi_dst_example_simple_lap_s_facto1_sched4_kway_pqrcpend ................***Timeout 361.84 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has 
been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.371305e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.141521e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.635749e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.098518e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.065537e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 6.027088e-01 s Time to initialize coeftab 9.409231e-02 s Time to factorize 6.782351e-01 s ( 7.72 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.758980e+00 s Time for refinement 1.204238e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A 
||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.214802e-07 max(|| b_i - A x_i ||_1) 1.202968e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.511638e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.214802e-07 max(|| b_i - A x_i ||_1) 1.202968e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.511638e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.214802e-07 max(|| b_i - A x_i ||_1) 1.202968e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.511638e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.214802e-07 max(|| b_i - A x_i ||_1) 1.202968e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.511638e+00 (SUCCESS) Start 3082: mpi_dst_example_simple_lap_s_facto1_sched4_kway_pqrcpend 2746/3626 Test #3083: mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_pqrcpbegin ...***Timeout 361.86 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 
Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.451989e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.015759e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.908413e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.098338e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.741149e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.241506e-01 s Time to initialize coeftab 1.839905e-01 s Time to factorize 4.018040e+00 s ( 1.30 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.493032e+00 s - iteration 1 : total iteration time 2.37 s error 2.5159e-11 Time for refinement 4.943261e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 
4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.702993e-08 max(|| b_i - A x_i ||_1) 2.861223e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.595386e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.702993e-08 max(|| b_i - A x_i ||_1) 2.861223e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.595386e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.702993e-08 max(|| b_i - A x_i ||_1) 2.861223e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.595386e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.702993e-08 max(|| b_i - A x_i ||_1) 2.861223e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.595386e-01 (SUCCESS) Start 3083: mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_pqrcpbegin 2746/3626 Test #3084: mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_pqrcpend .....***Timeout 361.87 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS 
Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.708877e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.516672e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.130695e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 9.051650e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.282462e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.965834e-01 s Time to initialize coeftab 1.188738e-01 s Time to factorize 1.133910e+00 s ( 4.62 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to 
solve 1.377775e+00 s Time for refinement 6.413251e+00 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.625746e-07 max(|| b_i - A x_i ||_1) 1.068310e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.342427e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.625746e-07 max(|| b_i - A x_i ||_1) 1.068310e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.342427e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.625746e-07 max(|| b_i - A x_i ||_1) 1.068310e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.342427e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.625746e-07 max(|| b_i - A x_i ||_1) 1.068310e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.342427e+00 (SUCCESS) Start 3084: mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_pqrcpend 2746/3626 Test #3086: mpi_dst_example_simple_lap_s_facto1_sched4_not_rqrcpend .................***Timeout 362.68 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 
16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.121297e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.386379e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.540292e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 8.149394e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.772046e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.152003e-01 s Time to initialize coeftab 9.955071e-02 s Time to factorize 7.549407e-01 s ( 6.93 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 
0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.694279e+00 s Time for refinement 1.120973e+00 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.047285e-07 max(|| b_i - A x_i ||_1) 1.169468e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.469542e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.047285e-07 max(|| b_i - A x_i ||_1) 1.169468e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.469542e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.047285e-07 max(|| b_i - A x_i ||_1) 1.169468e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.469542e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.047285e-07 max(|| b_i - A x_i ||_1) 1.169468e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.469542e+00 (SUCCESS) Start 3086: mpi_dst_example_simple_lap_s_facto1_sched4_not_rqrcpend 2746/3626 Test #3087: mpi_dst_example_simple_lap_s_facto1_sched4_kway_rqrcpbegin ..............***Timeout 362.69 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) 
Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.590199e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.643300e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.821261e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.168016e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.906815e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.084559e+00 s Time to initialize coeftab 4.943630e-01 s Time to factorize 1.142862e+01 s (468.92 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o 
Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44 Ko / 44.3 Ko ------------------------------------------------ Total 68.2 Ko / 68.5 Ko Time to solve 2.417652e+00 s - iteration 1 : total iteration time 2.1 s error 6.1151e-11 Time for refinement 4.199074e+00 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.250410e-08 max(|| b_i - A x_i ||_1) 2.920595e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.669991e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.250410e-08 max(|| b_i - A x_i ||_1) 2.920595e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.669991e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.250410e-08 max(|| b_i - A x_i ||_1) 2.920595e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.669991e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.250410e-08 max(|| b_i - A x_i ||_1) 2.920595e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.669991e-01 (SUCCESS) Start 3087: mpi_dst_example_simple_lap_s_facto1_sched4_kway_rqrcpbegin 2746/3626 Test #3088: mpi_dst_example_simple_lap_s_facto1_sched4_kway_rqrcpend ................***Timeout 362.75 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: 
Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.632337e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.827042e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.604123e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.631046e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.852494e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 7.139558e-01 s Time to initialize coeftab 8.640048e-02 s Time to factorize 7.280418e+00 s (736.09 
KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 2.061394e+00 s Time for refinement 1.831820e+00 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.035970e-07 max(|| b_i - A x_i ||_1) 8.567818e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.076624e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.035970e-07 max(|| b_i - A x_i ||_1) 8.567818e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.076624e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.035970e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.035970e-07 max(|| b_i - A x_i ||_1) 8.567818e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.076624e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.567818e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.076624e+00 (SUCCESS) Start 3088: mpi_dst_example_simple_lap_s_facto1_sched4_kway_rqrcpend 2746/3626 Test #3089: mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_rqrcpbegin ...***Timeout 362.80 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: 
Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.463222e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.711009e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.303419e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.917095e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.909005e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to 
initialize internal csc 2.624219e-01 s Time to initialize coeftab 6.197104e-01 s Time to factorize 4.851608e+00 s ( 1.08 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44 Ko / 44.3 Ko ------------------------------------------------ Total 68.2 Ko / 68.5 Ko Time to solve 1.080553e+00 s - iteration 1 : total iteration time 1.24 s error 4.6385e-11 Time for refinement 3.577100e+00 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.910044e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.910044e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.910044e-08 max(|| b_i - A x_i ||_1) 2.851911e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.583684e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.910044e-08 max(|| b_i - A x_i ||_1) 2.851911e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.583684e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.851911e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.583684e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.851911e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.583684e-01 (SUCCESS) Start 3089: mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_rqrcpbegin 2746/3626 Test #3090: mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_rqrcpend .....***Timeout 362.80 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.409141e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.239118e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.873612e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 
1.768859e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.244762e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 9.082290e-01 s Time to initialize coeftab 1.739913e-01 s Time to factorize 1.576719e+00 s ( 3.32 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.978196e+00 s Time for refinement 1.444921e+00 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.053186e-07 max(|| b_i - A x_i ||_1) 1.147270e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.441648e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.053186e-07 max(|| b_i - A x_i ||_1) 1.147270e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.441648e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.053186e-07 max(|| b_i - A x_i ||_1) 1.147270e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.441648e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.053186e-07 max(|| b_i - A x_i ||_1) 1.147270e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.441648e+00 (SUCCESS) Start 3090: mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_rqrcpend 2746/3626 Test #3092: mpi_dst_example_simple_lap_s_facto1_sched4_not_tqrcpend .................***Timeout 362.92 sec 
ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.016057e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.116726e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.060669e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in 
full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.615637e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.229956e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.574741e-01 s Time to initialize coeftab 3.747507e-02 s Time to factorize 1.010091e+00 s ( 5.18 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 8.501344e-01 s Time for refinement 5.094552e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.703483e-07 max(|| b_i - A x_i ||_1) 8.899439e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.118295e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.703483e-07 max(|| b_i - A x_i ||_1) 8.899439e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.118295e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.703483e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.703483e-07 max(|| b_i - A x_i ||_1) 8.899439e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.118295e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.899439e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.118295e+00 (SUCCESS) Start 3092: mpi_dst_example_simple_lap_s_facto1_sched4_not_tqrcpend 2746/3626 
Test #3093: mpi_dst_example_simple_lap_s_facto1_sched4_kway_tqrcpbegin ..............***Timeout 362.92 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.572103e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.523299e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.202184e-02 s +-------------------------------------------------+ 
Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 4.310714e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.021613e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.619454e-01 s Time to initialize coeftab 3.414983e-01 s Time to factorize 3.295453e+00 s ( 1.59 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.2 Ko / 44.3 Ko ------------------------------------------------ Total 68.4 Ko / 68.5 Ko Time to solve 5.691318e-01 s - iteration 1 : total iteration time 1.1 s error 3.7348e-11 Time for refinement 2.001433e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.743584e-08 max(|| b_i - A x_i ||_1) 2.819397e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.542827e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.743584e-08 max(|| b_i - A x_i ||_1) 2.819397e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.542827e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.743584e-08 max(|| b_i - A x_i ||_1) 2.819397e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.542827e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.743584e-08 max(|| b_i - A x_i ||_1) 
2.819397e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.542827e-01 (SUCCESS) Start 3093: mpi_dst_example_simple_lap_s_facto1_sched4_kway_tqrcpbegin 2746/3626 Test #3094: mpi_dst_example_simple_lap_s_facto1_sched4_kway_tqrcpend ................***Timeout 362.92 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.935319e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.510017e-01 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.276006e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.823863e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.080028e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 7.290574e-01 s Time to initialize coeftab 2.420032e-01 s Time to factorize 8.327992e+00 s (643.50 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 2.315043e+00 s Time for refinement 1.301518e+00 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.588260e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.588260e-07 max(|| b_i - A x_i ||_1) 8.749362e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.099436e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.588260e-07 max(|| b_i - A x_i ||_1) 8.749362e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.099436e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.749362e-08 max(|| b_i - A x_i ||_1 
/ (||A||_1 * ||x_i||_oo * eps)) 1.099436e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.588260e-07 max(|| b_i - A x_i ||_1) 8.749362e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.099436e+00 (SUCCESS) Start 3094: mpi_dst_example_simple_lap_s_facto1_sched4_kway_tqrcpend 2746/3626 Test #3096: mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_tqrcpend .....***Timeout 363.13 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.359477e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax 
Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.217611e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.000689e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.981158e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.399427e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 6.043346e-01 s Time to initialize coeftab 1.822786e-01 s Time to factorize 1.102390e+01 s (486.13 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Start 3096: mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_tqrcpend 2746/3626 Test #3097: mpi_dst_example_simple_lap_s_facto1_sched4_not_rqrrtbegin ...............***Timeout 363.14 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 
Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.753634e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.850679e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.060136e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.978258e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.053908e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.225823e+00 s Time to initialize coeftab 2.851403e+00 s Time to factorize 4.559001e+00 s ( 1.15 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 
Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.090848e+00 s - iteration 1 : total iteration time 0.834 s error 3.3207e-11 Time for refinement 1.853210e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.992678e-08 max(|| b_i - A x_i ||_1) 2.929231e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.680843e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.992678e-08 max(|| b_i - A x_i ||_1) 2.929231e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.680843e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.992678e-08 max(|| b_i - A x_i ||_1) 2.929231e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.680843e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.992678e-08 max(|| b_i - A x_i ||_1) 2.929231e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.680843e-01 (SUCCESS) Start 3097: mpi_dst_example_simple_lap_s_facto1_sched4_not_rqrrtbegin 2746/3626 Test #3100: mpi_dst_example_simple_lap_s_facto1_sched4_kway_rqrrtend ................***Timeout 362.31 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: 
Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.177440e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.209933e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.038770e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.465610e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.136646e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 
1.186564e+00 s Time to initialize coeftab 1.101865e+00 s Time to factorize 5.912231e+00 s (906.44 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 3.144470e+00 s Time for refinement 9.193585e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.732111e-07 max(|| b_i - A x_i ||_1) 1.113836e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.399636e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.732111e-07 max(|| b_i - A x_i ||_1) 1.113836e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.399636e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.732111e-07 max(|| b_i - A x_i ||_1) 1.113836e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.399636e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.732111e-07 max(|| b_i - A x_i ||_1) 1.113836e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.399636e+00 (SUCCESS) Start 3100: mpi_dst_example_simple_lap_s_facto1_sched4_kway_rqrrtend 2746/3626 Test #3102: mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_rqrrtend .....***Timeout 362.50 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.035743e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.938732e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.019466e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.494364e-01 s +-------------------------------------------------+ Analyze task: Total time for 
analyze 1.075894e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 7.720873e-02 s Time to initialize coeftab 4.698273e-02 s Time to factorize 1.269334e+00 s ( 4.12 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 7.711201e-01 s Time for refinement 3.939393e-01 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.764223e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.764223e-07 max(|| b_i - A x_i ||_1) 1.157210e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.454139e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.764223e-07 max(|| b_i - A x_i ||_1) 1.157210e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.454139e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.764223e-07 max(|| b_i - A x_i ||_1) 1.157210e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.454139e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.157210e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.454139e+00 (SUCCESS) Start 3102: mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_rqrrtend 2746/3626 Test #3103: mpi_dst_example_simple_lap_s_facto1_sched4_kway_pqrcpilu0 ...............***Timeout 362.51 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has 
been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.302827e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.694345e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.994086e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for 
mapping/scheduling 2.466404e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.107781e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 7.879601e-01 s Time to initialize coeftab 1.594378e-01 s Time to factorize 2.026665e+00 s ( 2.58 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 2.004149e+00 s - iteration 1 : total iteration time 2.38 s error 3.0575e-11 Time for refinement 5.047532e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.652079e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.652079e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.652079e-08 max(|| b_i - A x_i ||_1) 2.756140e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.463339e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.756140e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.463339e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.652079e-08 max(|| b_i - A x_i ||_1) 2.756140e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.463339e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.756140e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.463339e-01 (SUCCESS) Start 3103: mpi_dst_example_simple_lap_s_facto1_sched4_kway_pqrcpilu0 2746/3626 Test #3104: 
mpi_dst_example_simple_lap_s_facto1_sched4_kway_pqrcpilu1 ...............***Timeout 362.51 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.346234e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.826555e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.630498e-01 s +-------------------------------------------------+ Mapping/Scheduling 
subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.777505e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.604398e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.274600e-01 s Time to initialize coeftab 1.230270e-01 s Time to factorize 3.941318e+00 s ( 1.33 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.085400e+00 s - iteration 1 : total iteration time 1.16 s error 4.075e-11 Time for refinement 2.646037e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.704870e-08 max(|| b_i - A x_i ||_1) 2.814236e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.536341e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.704870e-08 max(|| b_i - A x_i ||_1) 2.814236e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.536341e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.704870e-08 max(|| b_i - A x_i ||_1) 2.814236e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.536341e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.704870e-08 max(|| b_i - A x_i ||_1) 2.814236e-08 max(|| 
b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.536341e-01 (SUCCESS) Start 3104: mpi_dst_example_simple_lap_s_facto1_sched4_kway_pqrcpilu1 2746/3626 Test #3105: mpi_dst_example_simple_lap_s_facto2_sched4_not_svdbegin .................***Timeout 362.52 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.037895e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.534704e-02 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.537499e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 4.083959e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.477500e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.784338e-02 s Time to initialize coeftab 2.384005e-01 s Time to factorize 5.300981e+00 s ( 1.88 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 9.086390e-01 s Time for refinement 1.012321e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.001351e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.001351e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.001351e-07 max(|| b_i - A x_i ||_1) 8.672932e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.089832e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.672932e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.089832e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.672932e-08 max(|| b_i - A x_i ||_1 / 
(||A||_1 * ||x_i||_oo * eps)) 1.089832e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.001351e-07 max(|| b_i - A x_i ||_1) 8.672932e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.089832e+00 (SUCCESS) Start 3105: mpi_dst_example_simple_lap_s_facto2_sched4_not_svdbegin 2746/3626 Test #3106: mpi_dst_example_simple_lap_s_facto2_sched4_not_svdend ...................***Timeout 362.53 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.618195e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of 
nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.935145e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.396673e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 4.074328e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.468464e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.796489e-01 s Time to initialize coeftab 2.877045e+00 s Time to factorize 5.309067e+00 s ( 1.88 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 8.339014e-01 s Time for refinement 5.477294e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.700026e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.700026e-07 max(|| b_i - A x_i ||_1) 7.388795e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.284689e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 7.388795e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.284689e-01 (SUCCESS) max(|| b_i - A x_i ||_2 
/ || b_i ||_2) 1.700026e-07 max(|| b_i - A x_i ||_1) 7.388795e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.284689e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.700026e-07 max(|| b_i - A x_i ||_1) 7.388795e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.284689e-01 (SUCCESS) Start 3106: mpi_dst_example_simple_lap_s_facto2_sched4_not_svdend 2746/3626 Test #3107: mpi_dst_example_simple_lap_s_facto2_sched4_kway_svdbegin ................***Timeout 362.53 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.881405e+00 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.547506e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.262683e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 6.756782e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.238094e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 4.907131e-01 s Time to initialize coeftab 3.430333e-01 s Time to factorize 3.613237e+00 s ( 2.76 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.3 Ko / 88.6 Ko ------------------------------------------------ Total 112 Ko / 113 Ko Time to solve 9.843371e-01 s Time for refinement 9.093330e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.960263e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.960263e-07 max(|| b_i - A x_i ||_1) 8.514014e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.069863e+00 (SUCCESS) max(|| 
b_i - A x_i ||_1) 8.514014e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.069863e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.960263e-07 max(|| b_i - A x_i ||_1) 8.514014e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.069863e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.960263e-07 max(|| b_i - A x_i ||_1) 8.514014e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.069863e+00 (SUCCESS) Start 3107: mpi_dst_example_simple_lap_s_facto2_sched4_kway_svdbegin 2746/3626 Test #3108: mpi_dst_example_simple_lap_s_facto2_sched4_kway_svdend ..................***Timeout 362.53 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ 
Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.487729e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.168766e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.274987e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.525060e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.115389e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 8.451411e-01 s Time to initialize coeftab 1.085155e-01 s Time to factorize 6.005103e+00 s ( 1.66 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 1.948806e+00 s Time for refinement 5.747789e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.722872e-07 max(|| b_i - A x_i ||_1) 7.525126e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.456001e-01 (SUCCESS) || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.722872e-07 max(|| b_i - A x_i ||_1) 7.525126e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * 
eps)) 9.456001e-01 (SUCCESS) || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.722872e-07 max(|| b_i - A x_i ||_1) 7.525126e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.456001e-01 (SUCCESS) || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.722872e-07 max(|| b_i - A x_i ||_1) 7.525126e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.456001e-01 (SUCCESS) Start 3108: mpi_dst_example_simple_lap_s_facto2_sched4_kway_svdend 2746/3626 Test #3112: mpi_dst_example_simple_lap_s_facto2_sched4_not_pqrcpend .................***Timeout 363.76 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 
300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.140821e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.474194e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.173535e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 3.158456e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.266963e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.529555e+00 s Time to initialize coeftab 4.410690e-01 s Time to factorize 5.758363e+00 s ( 1.73 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko Start 3112: mpi_dst_example_simple_lap_s_facto2_sched4_not_pqrcpend 2746/3626 Test #3114: mpi_dst_example_simple_lap_s_facto2_sched4_kway_pqrcpend ................***Timeout 364.35 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse 
matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.781732e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.904916e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.099502e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 5.690518e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.670737e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to 
initialize internal csc 2.939833e-01 s Time to initialize coeftab 1.221550e-01 s Time to factorize 1.808404e+00 s ( 5.52 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 8.703865e-01 s - iteration 1 : total iteration time 0.723 s error 6.476e-11 Time for refinement 1.850184e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.653538e-08 max(|| b_i - A x_i ||_1) 2.755858e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.462985e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.653538e-08 max(|| b_i - A x_i ||_1) 2.755858e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.462985e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.653538e-08 max(|| b_i - A x_i ||_1) 2.755858e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.462985e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.653538e-08 max(|| b_i - A x_i ||_1) 2.755858e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.462985e-01 (SUCCESS) Start 3114: mpi_dst_example_simple_lap_s_facto2_sched4_kway_pqrcpend 2746/3626 Test #3115: mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_pqrcpbegin ...***Timeout 364.37 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically 
set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.125653e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.944879e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.407019e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 3.572829e+00 s 
+-------------------------------------------------+ Analyze task: Total time for analyze 7.161976e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 4.886927e-01 s Time to initialize coeftab 3.075977e+00 s Time to factorize 2.978940e+00 s ( 3.35 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 1.053107e+00 s - iteration 1 : total iteration time 1.39 s error 3.7362e-11 Time for refinement 2.969068e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.162486e-08 max(|| b_i - A x_i ||_1) 2.956790e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.715474e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.162486e-08 max(|| b_i - A x_i ||_1) 2.956790e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.715474e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.162486e-08 max(|| b_i - A x_i ||_1) 2.956790e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.715474e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.162486e-08 max(|| b_i - A x_i ||_1) 2.956790e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.715474e-01 (SUCCESS) Start 3115: mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_pqrcpbegin 2746/3626 Test #3116: 
mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_pqrcpend .....***Timeout 363.27 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.817012e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.010472e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.090111e-01 s +-------------------------------------------------+ 
Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 5.944936e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.013330e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.691349e-01 s Time to initialize coeftab 1.366234e-01 s Time to factorize 3.637280e+00 s ( 2.75 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 1.226637e+00 s - iteration 1 : total iteration time 1.17 s error 7.743e-13 Time for refinement 2.486349e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.626470e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.626470e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.626470e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.626470e-08 max(|| b_i - A x_i ||_1) 2.735806e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.437787e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.735806e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.437787e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.735806e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.437787e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 
2.735806e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.437787e-01 (SUCCESS) Start 3116: mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_pqrcpend 2746/3626 Test #3118: mpi_dst_example_simple_lap_s_facto2_sched4_not_rqrcpend .................***Timeout 363.51 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.209843e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.238893e-01 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.590589e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.735036e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.069460e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.032503e+00 s Time to initialize coeftab 2.943407e-01 s Time to factorize 4.107695e+00 s ( 2.43 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko Start 3118: mpi_dst_example_simple_lap_s_facto2_sched4_not_rqrcpend 2746/3626 Test #3120: mpi_dst_example_simple_lap_s_facto2_sched4_kway_rqrcpend ................***Timeout 363.78 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 
Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.432512e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.290279e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.201913e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.932925e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.542786e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 7.814961e-01 s Time to initialize coeftab 1.811829e-01 s Time to factorize 2.689634e+00 s ( 3.71 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 
Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 1.348389e+00 s - iteration 1 : total iteration time 3.18 s error 9.8694e-13 Time for refinement 5.833100e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.488325e-08 max(|| b_i - A x_i ||_1) 2.672869e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.358701e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.488325e-08 max(|| b_i - A x_i ||_1) 2.672869e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.358701e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.488325e-08 max(|| b_i - A x_i ||_1) 2.672869e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.358701e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.488325e-08 max(|| b_i - A x_i ||_1) 2.672869e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.358701e-01 (SUCCESS) Start 3120: mpi_dst_example_simple_lap_s_facto2_sched4_kway_rqrcpend 2746/3626 Test #3122: mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_rqrcpend .....***Timeout 364.26 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: 
Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.193683e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.070167e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.452101e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.581515e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.588895e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.633278e+00 s Time to initialize coeftab 2.443043e-01 s Time to factorize 6.686795e+00 s ( 1.49 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: 
------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Start 3122: mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_rqrcpend 2746/3626 Test #3124: mpi_dst_example_simple_lap_s_facto2_sched4_not_tqrcpend .................***Timeout 359.82 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.701558e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 
17.592432 Time to compute symbol matrix 3.108818e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.174366e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.585169e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.179207e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.073776e+00 s Time to initialize coeftab 2.239710e-01 s Time to factorize 6.336247e+00 s ( 1.58 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko Start 3124: mpi_dst_example_simple_lap_s_facto2_sched4_not_tqrcpend 2746/3626 Test #3125: mpi_dst_example_simple_lap_s_facto2_sched4_kway_tqrcpbegin ..............***Timeout 359.83 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low 
rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.906961e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.471955e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.977240e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.391429e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.291745e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 5.841944e-01 s Time to initialize coeftab 8.016697e-01 s Time to factorize 1.424369e+01 s (717.81 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 
o / 0 o Inside selected 0 o / 0 o Outside 88 Ko / 88.6 Ko ------------------------------------------------ Total 112 Ko / 113 Ko Start 3125: mpi_dst_example_simple_lap_s_facto2_sched4_kway_tqrcpbegin 2746/3626 Test #3128: mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_tqrcpend .....***Timeout 361.22 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.773653e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol 
matrix 1.441914e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.942866e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.570165e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.209120e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 9.321470e-01 s Time to initialize coeftab 2.001549e-01 s Time to factorize 1.998008e+00 s ( 5.00 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 8.662790e-01 s - iteration 1 : total iteration time 1.57 s error 1.2078e-12 Time for refinement 3.393122e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.452397e-08 max(|| b_i - A x_i ||_1) 2.739099e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.441926e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.452397e-08 max(|| b_i - A x_i ||_1) 2.739099e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.441926e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 
5.452397e-08 max(|| b_i - A x_i ||_1) 2.739099e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.441926e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.452397e-08 max(|| b_i - A x_i ||_1) 2.739099e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.441926e-01 (SUCCESS) Start 3128: mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_tqrcpend 2746/3626 Test #3131: mpi_dst_example_simple_lap_s_facto2_sched4_kway_rqrrtbegin ..............***Timeout 362.63 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.346088e+00 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.516275e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.186274e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.352015e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.478841e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.875480e-01 s Time to initialize coeftab 6.122966e-01 s Time to factorize 1.560542e+01 s (655.17 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Start 3131: mpi_dst_example_simple_lap_s_facto2_sched4_kway_rqrrtbegin 2746/3626 Test #3132: mpi_dst_example_simple_lap_s_facto2_sched4_kway_rqrrtend ................***Timeout 362.65 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: 
Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.759331e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.242186e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.045160e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.674679e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.285742e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.663498e+00 s Time to initialize coeftab 3.441094e-01 s Time to factorize 3.722743e+00 
s ( 2.68 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Start 3132: mpi_dst_example_simple_lap_s_facto2_sched4_kway_rqrrtend 2746/3626 Test #3133: mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_rqrrtbegin ...***Timeout 362.53 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time 
to compute ordering 1.157664e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.073811e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.089075e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.883265e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.467346e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.441171e+00 s Time to initialize coeftab 1.322982e+00 s Time to factorize 3.006271e+00 s ( 3.32 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 7.989157e-01 s - iteration 1 : total iteration time 1.24 s error 7.3136e-11 Time for refinement 3.149708e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.933466e-08 max(|| b_i - A x_i ||_1) 2.898126e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * 
||x_i||_oo * eps)) 3.641757e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.933466e-08 max(|| b_i - A x_i ||_1) 2.898126e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.641757e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.933466e-08 max(|| b_i - A x_i ||_1) 2.898126e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.641757e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.933466e-08 max(|| b_i - A x_i ||_1) 2.898126e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.641757e-01 (SUCCESS) Start 3133: mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_rqrrtbegin 2746/3626 Test #3134: mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_rqrrtend .....***Timeout 362.53 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: 
CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.994079e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.489405e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.005208e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 3.437459e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.198937e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.104222e-01 s Time to initialize coeftab 1.361931e-01 s Time to factorize 4.432958e+00 s ( 2.25 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 2.874920e+00 s - iteration 1 : total iteration time 2.1 s error 3.7782e-12 Time for refinement 4.395776e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i 
||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.576994e-08 max(|| b_i - A x_i ||_1) 2.714202e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.410640e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.576994e-08 max(|| b_i - A x_i ||_1) 2.714202e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.410640e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.576994e-08 max(|| b_i - A x_i ||_1) 2.714202e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.410640e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.576994e-08 max(|| b_i - A x_i ||_1) 2.714202e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.410640e-01 (SUCCESS) Start 3134: mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_rqrrtend 2746/3626 Test #3135: mpi_dst_example_simple_lap_s_facto2_sched4_kway_pqrcpilu0 ...............***Timeout 362.35 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY 
Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.298479e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.776231e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.322137e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.676162e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.764642e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.038121e+00 s Time to initialize coeftab 2.741740e-01 s Time to factorize 3.519931e+00 s ( 2.84 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 2.554224e+00 s - iteration 1 : total iteration time 2.51 s error 1.4891e-11 Time for refinement 5.487779e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 
2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.083948e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.083948e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.083948e-08 max(|| b_i - A x_i ||_1) 2.971408e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.733842e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.971408e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.733842e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.971408e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.733842e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.083948e-08 max(|| b_i - A x_i ||_1) 2.971408e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.733842e-01 (SUCCESS) Start 3135: mpi_dst_example_simple_lap_s_facto2_sched4_kway_pqrcpilu0 2746/3626 Test #3136: mpi_dst_example_simple_lap_s_facto2_sched4_kway_pqrcpilu1 ...............***Timeout 362.35 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 
Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.148343e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.987853e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.238351e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.841783e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.062578e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.194788e+00 s Time to initialize coeftab 2.970436e-01 s Time to factorize 5.905991e+00 s ( 1.69 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Start 3136: mpi_dst_example_simple_lap_s_facto2_sched4_kway_pqrcpilu1 2746/3626 Test #3138: 
mpi_dst_example_simple_lap_d_facto0_sched4_not_svdend ...................***Timeout 362.24 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.887895e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.084873e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.744519e-01 s +-------------------------------------------------+ Mapping/Scheduling 
subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.843511e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.700953e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 8.083414e-01 s Time to initialize coeftab 3.034403e-01 s Time to factorize 1.008032e+01 s (514.25 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 2.571162e+00 s - iteration 1 : total iteration time 1.87 s error 3.4375e-15 Time for refinement 4.985573e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.437812e-15 max(|| b_i - A x_i ||_1) 3.356309e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.217491e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.437812e-15 max(|| b_i - A x_i ||_1) 3.356309e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.217491e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.437812e-15 max(|| b_i - A x_i ||_1) 3.356309e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.217491e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.437812e-15 max(|| b_i - A x_i ||_1) 3.356309e-16 max(|| b_i 
- A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.217491e-03 (SUCCESS) Start 3138: mpi_dst_example_simple_lap_d_facto0_sched4_not_svdend 2746/3626 Test #3143: mpi_dst_example_simple_lap_d_facto0_sched4_not_pqrcpbegin ...............***Timeout 362.56 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.220947e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.807999e-02 s +-------------------------------------------------+ 
Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.875568e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 9.303006e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.415867e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.082508e+00 s Time to initialize coeftab 1.430504e-01 s Time to factorize 2.712139e+00 s ( 1.87 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 8.184176e-01 s - iteration 1 : total iteration time 1.58 s error 1.5608e-14 Time for refinement 2.733760e+00 s || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.560346e-14 max(|| b_i - A x_i ||_1) 2.843652e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.573293e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.560346e-14 max(|| b_i - A x_i ||_1) 2.843652e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.573293e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.560346e-14 max(|| b_i - A x_i ||_1) 2.843652e-15 max(|| b_i - A x_i 
||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.573293e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.560346e-14 max(|| b_i - A x_i ||_1) 2.843652e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.573293e-02 (SUCCESS) Start 3143: mpi_dst_example_simple_lap_d_facto0_sched4_not_pqrcpbegin 2746/3626 Test #3144: mpi_dst_example_simple_lap_d_facto0_sched4_not_pqrcpend .................***Timeout 362.57 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.081076e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct 
Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.146345e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.849017e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.604250e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.098746e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.161026e+00 s Time to initialize coeftab 9.407939e-02 s Time to factorize 5.537134e+00 s (936.19 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 2.230018e+00 s - iteration 1 : total iteration time 4.02 s error 1.2804e-15 Time for refinement 6.360509e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.293217e-15 max(|| b_i - A x_i ||_1) 1.571796e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.975097e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.293217e-15 max(|| b_i - A x_i ||_1) 1.571796e-16 max(|| b_i - A x_i ||_1 / 
(||A||_1 * ||x_i||_oo * eps)) 1.975097e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.293217e-15 max(|| b_i - A x_i ||_1) 1.571796e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.975097e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.293217e-15 max(|| b_i - A x_i ||_1) 1.571796e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.975097e-03 (SUCCESS) Start 3144: mpi_dst_example_simple_lap_d_facto0_sched4_not_pqrcpend 2746/3626 Test #3147: mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_pqrcpbegin ...***Timeout 362.92 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method 
is: Scotch Time to compute ordering 5.262020e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.066013e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.031156e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.504007e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.410340e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.509992e+00 s Time to initialize coeftab 4.658377e-01 s Time to factorize 1.660900e+01 s (312.11 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Start 3147: mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_pqrcpbegin 2746/3626 Test #3149: mpi_dst_example_simple_lap_d_facto0_sched4_not_rqrcpbegin ...............***Timeout 363.16 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.593470e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.257116e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.464862e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.802101e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.110904e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize 
internal csc 1.439394e+00 s Time to initialize coeftab 4.335644e-01 s Time to factorize 3.822993e+00 s ( 1.32 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.4 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.522991e+00 s - iteration 1 : total iteration time 2.42 s error 5.5625e-14 Time for refinement 4.800269e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.562881e-14 max(|| b_i - A x_i ||_1) 1.096833e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.378265e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.562881e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.562881e-14 max(|| b_i - A x_i ||_1) 1.096833e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.378265e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 1.096833e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.378265e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.562881e-14 max(|| b_i - A x_i ||_1) 1.096833e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.378265e-01 (SUCCESS) Start 3149: mpi_dst_example_simple_lap_d_facto0_sched4_not_rqrcpbegin 2746/3626 Test #3150: mpi_dst_example_simple_lap_d_facto0_sched4_not_rqrcpend .................***Timeout 363.17 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.366675e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.263739e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.118849e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.097286e+00 s 
+-------------------------------------------------+ Analyze task: Total time for analyze 6.964713e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.774750e-01 s Time to initialize coeftab 6.692204e-02 s Time to factorize 2.034205e+00 s ( 2.49 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.305618e+00 s - iteration 1 : total iteration time 1.93 s error 1.2257e-15 Time for refinement 4.125339e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.225826e-15 max(|| b_i - A x_i ||_1) 1.684415e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.116613e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.225826e-15 max(|| b_i - A x_i ||_1) 1.684415e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.116613e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.225826e-15 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.225826e-15 max(|| b_i - A x_i ||_1) 1.684415e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.116613e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 1.684415e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.116613e-03 (SUCCESS) Start 3150: mpi_dst_example_simple_lap_d_facto0_sched4_not_rqrcpend 2746/3626 Test #3152: mpi_dst_example_simple_lap_d_facto0_sched4_kway_rqrcpend 
................***Timeout 363.40 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.165609e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.398952e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.580299e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 
17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.278578e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.153955e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.353943e-01 s Time to initialize coeftab 8.931683e-02 s Time to factorize 1.665828e+00 s ( 3.04 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.443623e+00 s - iteration 1 : total iteration time 5.5 s error 1.4925e-15 Time for refinement 7.813475e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.497253e-15 max(|| b_i - A x_i ||_1) 2.017440e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.535087e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.497253e-15 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.497253e-15 max(|| b_i - A x_i ||_1) 2.017440e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.535087e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 2.017440e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.535087e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.497253e-15 max(|| b_i - A x_i ||_1) 2.017440e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.535087e-03 
(SUCCESS) Start 3152: mpi_dst_example_simple_lap_d_facto0_sched4_kway_rqrcpend 2746/3626 Test #3158: mpi_dst_example_simple_lap_d_facto0_sched4_kway_tqrcpend ................***Timeout 364.35 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.496858e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.107375e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for 
reordering 7.516717e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.288129e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.023823e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 9.385058e-01 s Time to initialize coeftab 1.748371e-01 s Time to factorize 4.235958e+00 s ( 1.20 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.398132e+00 s - iteration 1 : total iteration time 5.02 s error 2.7847e-15 Time for refinement 8.235790e+00 s || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.781264e-15 max(|| b_i - A x_i ||_1) 2.303861e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.895000e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.781264e-15 max(|| b_i - A x_i ||_1) 2.303861e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.895000e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.781264e-15 max(|| b_i - A x_i ||_1) 2.303861e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.895000e-03 (SUCCESS) 
max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.781264e-15 max(|| b_i - A x_i ||_1) 2.303861e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.895000e-03 (SUCCESS) Start 3158: mpi_dst_example_simple_lap_d_facto0_sched4_kway_tqrcpend Test #2922: mpi_dst_example_simple_lap_z_facto1_sched1_kway_pqrcpend ................***Timeout 262.87 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.650297e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to 
compute symbol matrix 9.817078e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.742642e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.862524e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.129705e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.504299e-02 s Time to initialize coeftab 4.569757e-02 s Time to factorize 7.969967e+00 s ( 2.67 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 5.065664e-01 s - iteration 1 : total iteration time 0.889 s error 2.4976e-15 Time for refinement 1.920326e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.498817e-15 max(|| b_i - A x_i ||_1) 1.962239e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.951395e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.498817e-15 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.498817e-15 max(|| b_i - A x_i ||_1) 1.962239e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * 
||x_i||_oo * eps)) 4.951395e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.498817e-15 max(|| b_i - A x_i ||_1) 1.962239e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.951395e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 1.962239e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.951395e-03 (SUCCESS) Start 2922: mpi_dst_example_simple_lap_z_facto1_sched1_kway_pqrcpend Test #2938: mpi_dst_example_simple_lap_z_facto1_sched1_not_rqrrtend .................***Timeout 275.52 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.235849e+00 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.639092e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.051637e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.979592e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.716580e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 9.945973e-03 s Time to initialize coeftab 5.607886e-02 s Time to factorize 2.645382e+00 s ( 8.05 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 3.263254e-01 s - iteration 1 : total iteration time 0.548 s error 5.4805e-14 Time for refinement 1.347159e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.480503e-14 max(|| b_i - A x_i ||_1) 3.619996e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.134477e-02 
(SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.480503e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.480503e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.480503e-14 max(|| b_i - A x_i ||_1) 3.619996e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.134477e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 3.619996e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.134477e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 3.619996e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.134477e-02 (SUCCESS) Start 2938: mpi_dst_example_simple_lap_z_facto1_sched1_not_rqrrtend Test #2939: mpi_dst_example_simple_lap_z_facto1_sched1_kway_rqrrtbegin ..............***Timeout 275.48 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 
760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.247374e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.624612e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.591091e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.392844e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.555982e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 9.513636e-03 s Time to initialize coeftab 2.969361e-01 s Time to factorize 8.564741e+00 s ( 2.49 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 3.504980e-01 s - iteration 1 : total iteration time 0.803 s error 2.7929e-13 Time for refinement 1.467940e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| 
b_i - A x_i ||_2 / || b_i ||_2) 2.792878e-13 max(|| b_i - A x_i ||_1) 5.258121e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.326802e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.792878e-13 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.792878e-13 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.792878e-13 max(|| b_i - A x_i ||_1) 5.258121e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.326802e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 5.258121e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.326802e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 5.258121e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.326802e+00 (SUCCESS) Start 2939: mpi_dst_example_simple_lap_z_facto1_sched1_kway_rqrrtbegin Test #2953: mpi_dst_example_simple_lap_z_facto2_sched1_kway_pqrcpbegin ..............***Timeout 280.43 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 
Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.005262e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.541142e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.800702e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 8.096691e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.322803e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.011935e-01 s Time to initialize coeftab 2.978641e-01 s Time to factorize 4.718894e+00 s ( 8.47 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 355 Ko / 355 Ko ------------------------------------------------ Total 451 Ko / 451 Ko Time to solve 3.869331e-01 s - iteration 1 : total iteration time 1.12 s error 1.1833e-14 Time for refinement 2.518352e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| 
x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.183454e-14 max(|| b_i - A x_i ||_1) 1.947392e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.913931e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.183454e-14 max(|| b_i - A x_i ||_1) 1.947392e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.913931e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.183454e-14 max(|| b_i - A x_i ||_1) 1.947392e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.913931e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.183454e-14 max(|| b_i - A x_i ||_1) 1.947392e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.913931e-02 (SUCCESS) Start 2953: mpi_dst_example_simple_lap_z_facto2_sched1_kway_pqrcpbegin Test #2969: mpi_dst_example_simple_lap_z_facto2_sched1_not_rqrrtbegin ...............***Timeout 285.34 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 
2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.160657e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.796398e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.642410e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.208601e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.460916e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 7.810988e-03 s Time to initialize coeftab 2.388930e-01 s Time to factorize 8.765248e+00 s ( 4.56 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 355 Ko / 355 Ko ------------------------------------------------ Total 451 Ko / 451 Ko Time to solve 5.560613e-01 s - iteration 1 : total iteration time 0.599 s error 2.5398e-13 Time for refinement 1.592993e+00 s || A ||_1 5.112481e-02 max(|| 
b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.539831e-13 max(|| b_i - A x_i ||_1) 3.667954e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.255492e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.539831e-13 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.539831e-13 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.539831e-13 max(|| b_i - A x_i ||_1) 3.667954e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.255492e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 3.667954e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.255492e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 3.667954e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.255492e-01 (SUCCESS) Start 2969: mpi_dst_example_simple_lap_z_facto2_sched1_not_rqrrtbegin Test #2989: mpi_dst_example_simple_lap_z_facto3_sched1_not_rqrcpbegin ...............***Timeout 290.18 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP 
Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.136687e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.889951e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.978489e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.410698e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.669562e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 1.023988e-02 s Time to initialize coeftab 6.308195e-01 s Time to factorize 2.858991e+01 s (726.40 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko 
Time to solve 1.677058e+00 s - iteration 1 : total iteration time 1.71 s error 1.6766e-14 Time for refinement 3.052125e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.676749e-14 max(|| b_i - A x_i ||_1) 2.823163e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.123799e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.676749e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.676749e-14 max(|| b_i - A x_i ||_1) 2.823163e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.123799e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.676749e-14 max(|| b_i - A x_i ||_1) 2.823163e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.123799e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 2.823163e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.123799e-02 (SUCCESS) Start 2989: mpi_dst_example_simple_lap_z_facto3_sched1_not_rqrcpbegin Test #2997: mpi_dst_example_simple_lap_z_facto3_sched1_kway_tqrcpbegin ..............***Timeout 292.16 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution 
level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.448658e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.573307e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.468236e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 6.705572e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.257983e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 3.695743e-01 s Time to initialize coeftab 8.350820e-01 s Time to factorize 9.973758e+00 s ( 2.03 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 
Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 1.342697e+00 s - iteration 1 : total iteration time 1.17 s error 3.6766e-14 Time for refinement 4.993738e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.677053e-14 max(|| b_i - A x_i ||_1) 5.130314e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.294552e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.677053e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.677053e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.677053e-14 max(|| b_i - A x_i ||_1) 5.130314e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.294552e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 5.130314e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.294552e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 5.130314e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.294552e-01 (SUCCESS) Start 2997: mpi_dst_example_simple_lap_z_facto3_sched1_kway_tqrcpbegin Test #3001: mpi_dst_example_simple_lap_z_facto3_sched1_not_rqrrtbegin ...............***Timeout 293.75 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: 
Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.632081e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.459565e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.801415e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.911057e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.063411e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 2.383578e-02 s Time to initialize coeftab 2.485281e-01 s Time to factorize 2.157613e+01 s (962.53 KFlop/s) Number of operations 102.73 MFlops Number of 
static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 1.685066e+00 s - iteration 1 : total iteration time 2.55 s error 2.8848e-13 Time for refinement 3.957922e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.884811e-13 max(|| b_i - A x_i ||_1) 6.116728e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.543458e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.884811e-13 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.884811e-13 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.884811e-13 max(|| b_i - A x_i ||_1) 6.116728e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.543458e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 6.116728e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.543458e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 6.116728e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.543458e+00 (SUCCESS) Start 3001: mpi_dst_example_simple_lap_z_facto3_sched1_not_rqrrtbegin Test #3003: mpi_dst_example_simple_lap_z_facto3_sched1_kway_rqrrtbegin ..............***Timeout 294.14 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.931098e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.272983e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.060007e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.700991e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.837884e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to 
initialize internal csc 3.166642e-01 s Time to initialize coeftab 7.225490e-01 s Time to factorize 2.337338e+01 s (888.52 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 5.734734e-01 s - iteration 1 : total iteration time 1.71 s error 2.374e-13 Time for refinement 2.770268e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.374023e-13 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.374023e-13 max(|| b_i - A x_i ||_1) 4.703326e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.186809e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.374023e-13 max(|| b_i - A x_i ||_1) 4.703326e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.186809e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.374023e-13 max(|| b_i - A x_i ||_1) 4.703326e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.186809e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 4.703326e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.186809e+00 (SUCCESS) Start 3003: mpi_dst_example_simple_lap_z_facto3_sched1_kway_rqrrtbegin Test #3004: mpi_dst_example_simple_lap_z_facto3_sched1_kway_rqrrtend ................***Timeout 294.14 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.085885e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.801878e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.148359e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.345060e-01 s +-------------------------------------------------+ Analyze 
task: Total time for analyze 3.326982e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 8.708404e-03 s Time to initialize coeftab 4.430217e-02 s Start 3004: mpi_dst_example_simple_lap_z_facto3_sched1_kway_rqrrtend Test #3005: mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_rqrrtbegin ...***Timeout 294.07 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.852523e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax 
Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.820308e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.668826e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.412019e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.085598e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 1.017067e-02 s Time to initialize coeftab 6.911859e-01 s Time to factorize 2.316557e+01 s (896.49 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 1.101056e+00 s - iteration 1 : total iteration time 1.43 s error 2.7938e-13 Time for refinement 2.015552e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.793815e-13 max(|| b_i - A x_i ||_1) 5.654455e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.426811e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.793815e-13 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.793815e-13 max(|| 
b_i - A x_i ||_2 / || b_i ||_2) 2.793815e-13 max(|| b_i - A x_i ||_1) 5.654455e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.426811e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 5.654455e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.426811e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 5.654455e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.426811e+00 (SUCCESS) Start 3005: mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_rqrrtbegin Test #3008: mpi_dst_example_simple_lap_z_facto3_sched1_kway_pqrcpilu1 ...............***Timeout 296.77 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: 
Scotch Time to compute ordering 2.293626e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.765587e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.110470e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 7.544790e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.210825e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 9.007983e-02 s Time to initialize coeftab 8.161254e-02 s Time to factorize 2.602620e+00 s ( 7.79 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 4.694834e-01 s - iteration 1 : total iteration time 0.634 s error 1.061e-14 Time for refinement 1.883860e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.061010e-14 max(|| b_i - A x_i ||_1) 1.449326e-15 max(|| b_i - A x_i ||_1 / 
(||A||_1 * ||x_i||_oo * eps)) 3.657140e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.061010e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.061010e-14 max(|| b_i - A x_i ||_1) 1.449326e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.657140e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.061010e-14 max(|| b_i - A x_i ||_1) 1.449326e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.657140e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 1.449326e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.657140e-02 (SUCCESS) Start 3008: mpi_dst_example_simple_lap_z_facto3_sched1_kway_pqrcpilu1 Test #3020: mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_pqrcpend .....***Timeout 300.32 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 
1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.359106e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.016683e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.787100e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.217011e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.114292e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 1.791145e-01 s Time to initialize coeftab 1.677407e-01 s Start 3020: mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_pqrcpend Test #3049: mpi_dst_example_simple_lap_s_facto0_sched4_kway_pqrcpbegin ..............***Timeout 308.94 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication 
support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.292981e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.389187e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.349356e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 6.371011e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.031676e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.161163e-01 s Time to initialize coeftab 1.585210e-01 s Time to factorize 2.037919e+00 s ( 2.48 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 
o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 8.506929e-01 s - iteration 1 : total iteration time 1.68 s error 3.005e-11 Time for refinement 2.617350e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.900752e-08 max(|| b_i - A x_i ||_1) 2.947093e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.703288e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.900752e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.900752e-08 max(|| b_i - A x_i ||_1) 2.947093e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.703288e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.900752e-08 max(|| b_i - A x_i ||_1) 2.947093e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.703288e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.947093e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.703288e-01 (SUCCESS) Start 3049: mpi_dst_example_simple_lap_s_facto0_sched4_kway_pqrcpbegin Test #3057: mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_rqrcpbegin ...***Timeout 311.98 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI 
processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.813549e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.139091e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.055446e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.763822e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.083950e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 8.661462e-03 s Time to initialize coeftab 1.651411e-01 s Time to factorize 5.254427e+00 s (986.56 
KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44 Ko / 44.3 Ko ------------------------------------------------ Total 68.2 Ko / 68.5 Ko Time to solve 9.769270e-01 s - iteration 1 : total iteration time 1.49 s error 4.9412e-11 Time for refinement 2.563626e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.199435e-08 max(|| b_i - A x_i ||_1) 3.038193e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.817763e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.199435e-08 max(|| b_i - A x_i ||_1) 3.038193e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.817763e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.199435e-08 max(|| b_i - A x_i ||_1) 3.038193e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.817763e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.199435e-08 max(|| b_i - A x_i ||_1) 3.038193e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.817763e-01 (SUCCESS) Start 3057: mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_rqrcpbegin 2746/3626 Test #3160: mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_tqrcpend .....***Timeout 316.61 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.526052e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.654442e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.564532e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.445813e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.742228e+00 s 
+-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.466743e-02 s Time to initialize coeftab 3.448049e-02 s Time to factorize 1.113452e+00 s ( 4.55 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.116040e+00 s - iteration 1 : total iteration time 1.37 s error 9.0472e-16 Time for refinement 2.123379e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.131076e-16 max(|| b_i - A x_i ||_1) 1.216679e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.528862e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.131076e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.131076e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.131076e-16 max(|| b_i - A x_i ||_1) 1.216679e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.528862e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 1.216679e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.528862e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 1.216679e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.528862e-03 (SUCCESS) Start 3160: mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_tqrcpend Start 3189: mpi_dst_example_simple_lap_d_facto1_sched4_kway_tqrcpbegin Start 3190: mpi_dst_example_simple_lap_d_facto1_sched4_kway_tqrcpend Start 3191: 
mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_tqrcpbegin Start 3192: mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_tqrcpend Start 3193: mpi_dst_example_simple_lap_d_facto1_sched4_not_rqrrtbegin Start 3194: mpi_dst_example_simple_lap_d_facto1_sched4_not_rqrrtend Start 3195: mpi_dst_example_simple_lap_d_facto1_sched4_kway_rqrrtbegin Start 3196: mpi_dst_example_simple_lap_d_facto1_sched4_kway_rqrrtend Start 3197: mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_rqrrtbegin Start 3198: mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_rqrrtend Start 3199: mpi_dst_example_simple_lap_d_facto1_sched4_kway_pqrcpilu0 Start 3200: mpi_dst_example_simple_lap_d_facto1_sched4_kway_pqrcpilu1 Start 3201: mpi_dst_example_simple_lap_d_facto2_sched4_not_svdbegin Start 3202: mpi_dst_example_simple_lap_d_facto2_sched4_not_svdend Start 3203: mpi_dst_example_simple_lap_d_facto2_sched4_kway_svdbegin Start 3204: mpi_dst_example_simple_lap_d_facto2_sched4_kway_svdend Start 3205: mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_svdbegin Start 3206: mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_svdend Start 3207: mpi_dst_example_simple_lap_d_facto2_sched4_not_pqrcpbegin Start 3208: mpi_dst_example_simple_lap_d_facto2_sched4_not_pqrcpend Start 3209: mpi_dst_example_simple_lap_d_facto2_sched4_kway_pqrcpbegin Start 3210: mpi_dst_example_simple_lap_d_facto2_sched4_kway_pqrcpend Start 3211: mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_pqrcpbegin Start 3212: mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_pqrcpend Start 3213: mpi_dst_example_simple_lap_d_facto2_sched4_not_rqrcpbegin Start 3214: mpi_dst_example_simple_lap_d_facto2_sched4_not_rqrcpend Start 3215: mpi_dst_example_simple_lap_d_facto2_sched4_kway_rqrcpbegin Start 3216: mpi_dst_example_simple_lap_d_facto2_sched4_kway_rqrcpend Start 3217: mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_rqrcpbegin Start 3218: 
mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_rqrcpend Start 3219: mpi_dst_example_simple_lap_d_facto2_sched4_not_tqrcpbegin Start 3220: mpi_dst_example_simple_lap_d_facto2_sched4_not_tqrcpend Start 3221: mpi_dst_example_simple_lap_d_facto2_sched4_kway_tqrcpbegin Start 3222: mpi_dst_example_simple_lap_d_facto2_sched4_kway_tqrcpend Start 3223: mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_tqrcpbegin Start 3224: mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_tqrcpend Start 3225: mpi_dst_example_simple_lap_d_facto2_sched4_not_rqrrtbegin Start 3226: mpi_dst_example_simple_lap_d_facto2_sched4_not_rqrrtend Start 3227: mpi_dst_example_simple_lap_d_facto2_sched4_kway_rqrrtbegin Start 3228: mpi_dst_example_simple_lap_d_facto2_sched4_kway_rqrrtend Start 3229: mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_rqrrtbegin Start 3230: mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_rqrrtend Start 3231: mpi_dst_example_simple_lap_d_facto2_sched4_kway_pqrcpilu0 Start 3232: mpi_dst_example_simple_lap_d_facto2_sched4_kway_pqrcpilu1 Start 3233: mpi_dst_example_simple_lap_c_facto0_sched4_not_svdbegin Start 3234: mpi_dst_example_simple_lap_c_facto0_sched4_not_svdend Start 3235: mpi_dst_example_simple_lap_c_facto0_sched4_kway_svdbegin Start 3236: mpi_dst_example_simple_lap_c_facto0_sched4_kway_svdend Start 3237: mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_svdbegin Start 3238: mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_svdend Start 3239: mpi_dst_example_simple_lap_c_facto0_sched4_not_pqrcpbegin Start 3240: mpi_dst_example_simple_lap_c_facto0_sched4_not_pqrcpend Start 3241: mpi_dst_example_simple_lap_c_facto0_sched4_kway_pqrcpbegin Start 3242: mpi_dst_example_simple_lap_c_facto0_sched4_kway_pqrcpend Start 3243: mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_pqrcpbegin Start 3244: mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_pqrcpend Start 3245: 
mpi_dst_example_simple_lap_c_facto0_sched4_not_rqrcpbegin Start 3246: mpi_dst_example_simple_lap_c_facto0_sched4_not_rqrcpend Start 3247: mpi_dst_example_simple_lap_c_facto0_sched4_kway_rqrcpbegin Start 3248: mpi_dst_example_simple_lap_c_facto0_sched4_kway_rqrcpend Start 3249: mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_rqrcpbegin Start 3250: mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_rqrcpend Start 3251: mpi_dst_example_simple_lap_c_facto0_sched4_not_tqrcpbegin Start 3252: mpi_dst_example_simple_lap_c_facto0_sched4_not_tqrcpend Start 3253: mpi_dst_example_simple_lap_c_facto0_sched4_kway_tqrcpbegin Start 3254: mpi_dst_example_simple_lap_c_facto0_sched4_kway_tqrcpend Start 3255: mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_tqrcpbegin Start 3256: mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_tqrcpend Start 3257: mpi_dst_example_simple_lap_c_facto0_sched4_not_rqrrtbegin 2746/3626 Test #3171: mpi_dst_example_simple_lap_d_facto1_sched4_kway_svdbegin ................ Passed 280.90 sec Test #2813: mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_rqrrtbegin ... Passed 298.67 sec 2748/3626 Test #3165: mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_rqrrtbegin ... Passed 299.73 sec 2749/3626 Test #3173: mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_svdbegin ..... Passed 274.99 sec 2750/3626 Test #3185: mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_rqrcpbegin ... Passed 250.57 sec 2751/3626 Test #3184: mpi_dst_example_simple_lap_d_facto1_sched4_kway_rqrcpend ................ Passed 252.08 sec 2752/3626 Test #3187: mpi_dst_example_simple_lap_d_facto1_sched4_not_tqrcpbegin ............... Passed 250.14 sec 2753/3626 Test #3188: mpi_dst_example_simple_lap_d_facto1_sched4_not_tqrcpend ................. Passed 248.82 sec 2754/3626 Test #3172: mpi_dst_example_simple_lap_d_facto1_sched4_kway_svdend .................. 
Passed 278.85 sec Test #2904: mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_tqrcpend .....***Timeout 444.94 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.541540e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.397148e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.110548e-02 s 
+-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 4.778374e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.376952e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.812679e+00 s Time to initialize coeftab 2.384234e-01 s Start 2904: mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_tqrcpend Test #2905: mpi_dst_example_simple_lap_z_facto0_sched1_not_rqrrtbegin ...............***Timeout 444.96 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian 
Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.648384e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.602639e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.003185e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.135502e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.855765e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.033052e+00 s Time to initialize coeftab 1.213759e+00 s Start 2905: mpi_dst_example_simple_lap_z_facto0_sched1_not_rqrrtbegin Test #2907: mpi_dst_example_simple_lap_z_facto0_sched1_kway_rqrrtbegin ..............***Timeout 444.95 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple 
Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.619918e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.642946e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.803701e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.783316e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.380549e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 9.827857e-01 s Time to initialize coeftab 8.307656e-01 s Start 2907: mpi_dst_example_simple_lap_z_facto0_sched1_kway_rqrrtbegin Test #2910: mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_rqrrtend .....***Timeout 444.94 
sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2910: mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_rqrrtend Test #2912: mpi_dst_example_simple_lap_z_facto0_sched1_kway_pqrcpilu1 ...............***Timeout 444.93 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 
Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.682605e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.266192e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.078665e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.054703e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.100444e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 
1.861650e+00 s Time to initialize coeftab 3.779900e-01 s Start 2912: mpi_dst_example_simple_lap_z_facto0_sched1_kway_pqrcpilu1 Test #2915: mpi_dst_example_simple_lap_z_facto1_sched1_kway_svdbegin ................***Timeout 444.82 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.594801e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.252124e-02 s +-------------------------------------------------+ Reordering subtask: 
Split level 0 Stoping criterion -1 Start 2915: mpi_dst_example_simple_lap_z_facto1_sched1_kway_svdbegin Test #2916: mpi_dst_example_simple_lap_z_facto1_sched1_kway_svdend ..................***Timeout 444.86 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.131626e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.219966e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping 
criterion -1 Time for reordering 9.451927e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.898455e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.001005e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.032036e+00 s Time to initialize coeftab 2.577186e-01 s Start 2916: mpi_dst_example_simple_lap_z_facto1_sched1_kway_svdend 2755/3626 Test #3063: mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_tqrcpbegin ...***Timeout 444.89 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been 
automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.771129e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.789758e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.115152e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.995576e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.336989e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Start 3063: mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_tqrcpbegin 2755/3626 Test #3075: mpi_dst_example_simple_lap_s_facto1_sched4_kway_svdbegin ................***Timeout 444.76 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) 
Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 3075: mpi_dst_example_simple_lap_s_facto1_sched4_kway_svdbegin 2755/3626 Test #3076: mpi_dst_example_simple_lap_s_facto1_sched4_kway_svdend ..................***Timeout 444.78 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 
+-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.549008e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.228910e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.538192e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Start 3076: mpi_dst_example_simple_lap_s_facto1_sched4_kway_svdend 2755/3626 Test #3077: mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_svdbegin .....***Timeout 444.78 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric 
Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.015764e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.356686e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.079839e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.741125e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.346347e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 6.053802e-01 s Time to initialize coeftab 5.239958e-01 s Start 3077: mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_svdbegin 2755/3626 Test #3078: mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_svdend .......***Timeout 444.82 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: 
PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.491141e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.789214e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.955446e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.617503e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.212049e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.016784e+00 s Time to initialize coeftab 1.659786e-01 s Start 3078: mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_svdend 2755/3626 Test #3079: 
mpi_dst_example_simple_lap_s_facto1_sched4_not_pqrcpbegin ...............***Timeout 444.85 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.958996e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.406079e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.568359e-01 s +-------------------------------------------------+ Mapping/Scheduling 
subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.374710e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.879405e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.152697e+00 s Time to initialize coeftab 1.942646e-01 s Time to factorize 5.855647e+00 s (915.20 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Start 3079: mpi_dst_example_simple_lap_s_facto1_sched4_not_pqrcpbegin 2755/3626 Test #3085: mpi_dst_example_simple_lap_s_facto1_sched4_not_rqrcpbegin ...............***Timeout 444.82 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 
ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 3085: mpi_dst_example_simple_lap_s_facto1_sched4_not_rqrcpbegin 2755/3626 Test #3091: mpi_dst_example_simple_lap_s_facto1_sched4_not_tqrcpbegin ...............***Timeout 444.78 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.573968e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 
7.436964e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.490491e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 4.280436e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.021502e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 9.540375e-01 s Time to initialize coeftab 7.288431e-01 s Time to factorize 1.629840e+01 s (328.81 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44 Ko / 44.3 Ko Start 3091: mpi_dst_example_simple_lap_s_facto1_sched4_not_tqrcpbegin 2755/3626 Test #3095: mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_tqrcpbegin ...***Timeout 444.74 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia 
K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.829708e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.505371e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.397669e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 4.377530e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.215451e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.852781e+00 s Time to initialize coeftab 7.220145e-01 s Time to factorize 1.192306e+01 s (449.47 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Start 
3095: mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_tqrcpbegin 2755/3626 Test #3098: mpi_dst_example_simple_lap_s_facto1_sched4_not_rqrrtend .................***Timeout 444.74 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.427665e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.756402e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for 
reordering 1.336157e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.252436e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.529483e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 5.271572e-01 s Time to initialize coeftab 1.100041e-01 s Start 3098: mpi_dst_example_simple_lap_s_facto1_sched4_not_rqrrtend 2755/3626 Test #3101: mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_rqrrtbegin ...***Timeout 443.59 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 
256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.917105e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.724008e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.502837e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.997136e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.002773e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.028137e+00 s Time to initialize coeftab 7.510302e-01 s Time to factorize 1.549036e+01 s (345.96 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Start 3101: mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_rqrrtbegin 2755/3626 Test #3110: mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_svdend .......***Timeout 443.50 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: 
Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.183804e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.562153e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.087521e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.510286e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.235685e+00 s 
+-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.685345e-01 s Time to initialize coeftab 1.333722e-01 s Start 3110: mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_svdend 2755/3626 Test #3111: mpi_dst_example_simple_lap_s_facto2_sched4_not_pqrcpbegin ...............***Timeout 443.62 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.837451e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 
65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.725294e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.188778e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.083629e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.398844e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.704207e-01 s Time to initialize coeftab 2.587932e-01 s Time to factorize 3.959131e+00 s ( 2.52 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Start 3111: mpi_dst_example_simple_lap_s_facto2_sched4_not_pqrcpbegin 2755/3626 Test #3113: mpi_dst_example_simple_lap_s_facto2_sched4_kway_pqrcpbegin ..............***Timeout 443.78 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 
Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.090259e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.234537e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.927838e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 4.154601e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.119475e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.452159e+00 s Time to initialize coeftab 5.349988e-01 s Time to factorize 7.582609e+00 s ( 1.32 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank 
supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Start 3113: mpi_dst_example_simple_lap_s_facto2_sched4_kway_pqrcpbegin 2755/3626 Test #3117: mpi_dst_example_simple_lap_s_facto2_sched4_not_rqrcpbegin ...............***Timeout 442.84 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.474126e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct 
Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.273695e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.006831e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 4.660161e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.266312e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.040508e+00 s Time to initialize coeftab 9.029829e-01 s Time to factorize 1.087801e+01 s (939.90 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.2 Ko / 88.6 Ko ------------------------------------------------ Total 112 Ko / 113 Ko Start 3117: mpi_dst_example_simple_lap_s_facto2_sched4_not_rqrcpbegin 2755/3626 Test #3119: mpi_dst_example_simple_lap_s_facto2_sched4_kway_rqrcpbegin ..............***Timeout 442.85 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication 
support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.163376e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.029710e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.436396e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 3.367130e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.244239e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.926471e+00 s Time to initialize coeftab 1.597354e+00 s Time to factorize 1.260153e+01 s (811.35 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Start 3119: 
mpi_dst_example_simple_lap_s_facto2_sched4_kway_rqrcpbegin 2755/3626 Test #3121: mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_rqrcpbegin ...***Timeout 442.91 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.145396e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.602741e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for 
reordering 1.084888e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.764867e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.164706e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.464939e+00 s Time to initialize coeftab 6.153410e-01 s Time to factorize 1.886435e+01 s (541.99 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88 Ko / 88.6 Ko ------------------------------------------------ Total 112 Ko / 113 Ko Start 3121: mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_rqrcpbegin 2755/3626 Test #3123: mpi_dst_example_simple_lap_s_facto2_sched4_not_tqrcpbegin ...............***Timeout 441.21 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 
1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.204247e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.763611e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.404803e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.495652e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.029366e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.147160e+00 s Time to initialize coeftab 5.738470e-01 s Time to factorize 1.595458e+01 s (640.83 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.4 Ko / 88.6 Ko 
------------------------------------------------ Total 113 Ko / 113 Ko Start 3123: mpi_dst_example_simple_lap_s_facto2_sched4_not_tqrcpbegin Test #2913: mpi_dst_example_simple_lap_z_facto1_sched1_not_svdbegin .................***Timeout 438.16 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.263191e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.334145e-01 s +-------------------------------------------------+ 
Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.905455e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Start 2913: mpi_dst_example_simple_lap_z_facto1_sched1_not_svdbegin 2755/3626 Test #3126: mpi_dst_example_simple_lap_s_facto2_sched4_kway_tqrcpend ................***Timeout 438.20 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.763779e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 
Time to compute symbol matrix 2.275875e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.302389e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.828309e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.610058e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 9.639050e-01 s Time to initialize coeftab 2.212851e-01 s Time to factorize 8.249835e+00 s ( 1.21 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Start 3126: mpi_dst_example_simple_lap_s_facto2_sched4_kway_tqrcpend 2755/3626 Test #3127: mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_tqrcpbegin ...***Timeout 438.22 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank 
parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.529724e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.332166e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.954833e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.850006e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.634266e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 9.113627e-01 s Start 3127: mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_tqrcpbegin 2755/3626 Test #3129: mpi_dst_example_simple_lap_s_facto2_sched4_not_rqrrtbegin ...............***Timeout 438.25 sec ischedInit: The thread number has been 
automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 3129: mpi_dst_example_simple_lap_s_facto2_sched4_not_rqrrtbegin 2755/3626 Test #3130: mpi_dst_example_simple_lap_s_facto2_sched4_not_rqrrtend .................***Timeout 438.26 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal 
width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 3130: mpi_dst_example_simple_lap_s_facto2_sched4_not_rqrrtend 2755/3626 Test #3137: mpi_dst_example_simple_lap_d_facto0_sched4_not_svdbegin .................***Timeout 437.78 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 3137: mpi_dst_example_simple_lap_d_facto0_sched4_not_svdbegin 2755/3626 Test #3139: mpi_dst_example_simple_lap_d_facto0_sched4_kway_svdbegin ................***Timeout 437.50 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization 
method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3139: mpi_dst_example_simple_lap_d_facto0_sched4_kway_svdbegin 2755/3626 Test #3140: mpi_dst_example_simple_lap_d_facto0_sched4_kway_svdend ..................***Timeout 437.51 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute 
ordering 6.614025e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.614491e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.248980e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.867162e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.123778e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.033244e+00 s Time to initialize coeftab 2.769961e-01 s Time to factorize 2.032064e+01 s (255.10 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Start 3140: mpi_dst_example_simple_lap_d_facto0_sched4_kway_svdend 2755/3626 Test #3141: mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_svdbegin .....***Timeout 437.48 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 
MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.511405e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.696801e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.564967e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 4.006728e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.128546e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.342814e+00 s Time to initialize coeftab 
1.087648e+00 s Start 3141: mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_svdbegin 2755/3626 Test #3142: mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_svdend .......***Timeout 437.50 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.985892e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.523116e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 
Stoping criterion -1 Time for reordering 1.223489e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.884223e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.178372e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.537215e+00 s Time to initialize coeftab 3.078653e-01 s Start 3142: mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_svdend 2755/3626 Test #3145: mpi_dst_example_simple_lap_d_facto0_sched4_kway_pqrcpbegin ..............***Timeout 437.48 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 3145: mpi_dst_example_simple_lap_d_facto0_sched4_kway_pqrcpbegin 2755/3626 Test #3146: mpi_dst_example_simple_lap_d_facto0_sched4_kway_pqrcpend ................***Timeout 437.49 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL 
GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3146: mpi_dst_example_simple_lap_d_facto0_sched4_kway_pqrcpend 2755/3626 Test #3148: mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_pqrcpend .....***Timeout 437.47 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 
Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.304945e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.173287e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.166679e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.613870e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.224355e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Start 3148: mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_pqrcpend 2755/3626 Test #3151: mpi_dst_example_simple_lap_d_facto0_sched4_kway_rqrcpbegin ..............***Timeout 437.44 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking 
size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.358421e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.134562e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.490402e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.437467e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.289198e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 8.767313e-01 s Time to initialize coeftab 4.254392e-01 s Start 3151: mpi_dst_example_simple_lap_d_facto0_sched4_kway_rqrcpbegin 2755/3626 Test #3153: mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_rqrcpbegin ...***Timeout 437.52 sec ischedInit: The thread number 
has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 3153: mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_rqrcpbegin 2755/3626 Test #3154: mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_rqrcpend .....***Timeout 437.59 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress 
method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 3154: mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_rqrcpend 2755/3626 Test #3155: mpi_dst_example_simple_lap_d_facto0_sched4_not_tqrcpbegin ...............***Timeout 437.59 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 
+-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.528350e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.514370e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.105500e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.089607e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.045246e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.216368e+00 s Time to initialize coeftab 8.420153e-01 s Time to factorize 1.342504e+01 s (386.13 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Start 3155: mpi_dst_example_simple_lap_d_facto0_sched4_not_tqrcpbegin 2755/3626 Test #3156: mpi_dst_example_simple_lap_d_facto0_sched4_not_tqrcpend .................***Timeout 437.66 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI 
processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.488559e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.040174e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.076433e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.783962e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.288758e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.752589e+00 s Time to initialize coeftab 3.126377e-01 
s Time to factorize 6.730663e+00 s (770.18 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Start 3156: mpi_dst_example_simple_lap_d_facto0_sched4_not_tqrcpend 2755/3626 Test #3157: mpi_dst_example_simple_lap_d_facto0_sched4_kway_tqrcpbegin ..............***Timeout 437.73 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 3157: mpi_dst_example_simple_lap_d_facto0_sched4_kway_tqrcpbegin Test #2917: mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_svdbegin .....***Timeout 334.19 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been 
automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 2917: mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_svdbegin Test #2918: mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_svdend .......***Timeout 334.21 sec ischedInit: The thread number has been automatically set to 256 Start 2918: mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_svdend Test #2919: mpi_dst_example_simple_lap_z_facto1_sched1_not_pqrcpbegin ...............***Timeout 334.22 sec ischedInit: The thread number has been automatically set to 256 Start 2919: mpi_dst_example_simple_lap_z_facto1_sched1_not_pqrcpbegin Test #2920: mpi_dst_example_simple_lap_z_facto1_sched1_not_pqrcpend .................***Timeout 334.22 sec Start 2920: mpi_dst_example_simple_lap_z_facto1_sched1_not_pqrcpend Test #2921: mpi_dst_example_simple_lap_z_facto1_sched1_kway_pqrcpbegin ..............***Timeout 334.22 sec Start 2921: 
mpi_dst_example_simple_lap_z_facto1_sched1_kway_pqrcpbegin Test #2923: mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_pqrcpbegin ...***Timeout 334.21 sec Start 2923: mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_pqrcpbegin Test #2924: mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_pqrcpend .....***Timeout 334.22 sec Start 2924: mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_pqrcpend Test #2925: mpi_dst_example_simple_lap_z_facto1_sched1_not_rqrcpbegin ...............***Timeout 334.30 sec Start 2925: mpi_dst_example_simple_lap_z_facto1_sched1_not_rqrcpbegin Test #2926: mpi_dst_example_simple_lap_z_facto1_sched1_not_rqrcpend .................***Timeout 334.29 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2926: mpi_dst_example_simple_lap_z_facto1_sched1_not_rqrcpend Test #2927: mpi_dst_example_simple_lap_z_facto1_sched1_kway_rqrcpbegin 
..............***Timeout 334.31 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 2927: mpi_dst_example_simple_lap_z_facto1_sched1_kway_rqrcpbegin Test #2928: mpi_dst_example_simple_lap_z_facto1_sched1_kway_rqrcpend ................***Timeout 334.31 sec Start 2928: mpi_dst_example_simple_lap_z_facto1_sched1_kway_rqrcpend Test #2929: mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_rqrcpbegin ...***Timeout 334.31 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational 
models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2929: mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_rqrcpbegin Test #2930: mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_rqrcpend .....***Timeout 334.32 sec Start 2930: mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_rqrcpend Test #2931: mpi_dst_example_simple_lap_z_facto1_sched1_not_tqrcpbegin ...............***Timeout 334.32 sec Start 2931: mpi_dst_example_simple_lap_z_facto1_sched1_not_tqrcpbegin Test #2932: mpi_dst_example_simple_lap_z_facto1_sched1_not_tqrcpend .................***Timeout 334.26 sec ischedInit: The thread number has been automatically set to 256 Start 2932: mpi_dst_example_simple_lap_z_facto1_sched1_not_tqrcpend Test #2933: mpi_dst_example_simple_lap_z_facto1_sched1_kway_tqrcpbegin ..............***Timeout 334.24 sec ischedInit: The thread number has been automatically set to 256 Start 2933: mpi_dst_example_simple_lap_z_facto1_sched1_kway_tqrcpbegin Test #2934: mpi_dst_example_simple_lap_z_facto1_sched1_kway_tqrcpend ................***Timeout 333.95 sec Start 2934: mpi_dst_example_simple_lap_z_facto1_sched1_kway_tqrcpend Test #2935: mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_tqrcpbegin ...***Timeout 333.72 sec Start 2935: mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_tqrcpbegin Test #2936: mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_tqrcpend .....***Timeout 333.62 sec Start 2936: 
mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_tqrcpend Test #2937: mpi_dst_example_simple_lap_z_facto1_sched1_not_rqrrtbegin ...............***Timeout 333.62 sec Start 2937: mpi_dst_example_simple_lap_z_facto1_sched1_not_rqrrtbegin Test #2940: mpi_dst_example_simple_lap_z_facto1_sched1_kway_rqrrtend ................***Timeout 333.45 sec Start 2940: mpi_dst_example_simple_lap_z_facto1_sched1_kway_rqrrtend Test #2941: mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_rqrrtbegin ...***Timeout 333.37 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2941: mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_rqrrtbegin Test #2942: mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_rqrrtend .....***Timeout 333.38 sec Start 2942: mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_rqrrtend Test #2943: mpi_dst_example_simple_lap_z_facto1_sched1_kway_pqrcpilu0 ...............***Timeout 333.39 sec ischedInit: The 
thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2943: mpi_dst_example_simple_lap_z_facto1_sched1_kway_pqrcpilu0 Test #2944: mpi_dst_example_simple_lap_z_facto1_sched1_kway_pqrcpilu1 ...............***Timeout 333.39 sec Start 2944: mpi_dst_example_simple_lap_z_facto1_sched1_kway_pqrcpilu1 Test #2945: mpi_dst_example_simple_lap_z_facto2_sched1_not_svdbegin .................***Timeout 333.40 sec Start 2945: mpi_dst_example_simple_lap_z_facto2_sched1_not_svdbegin Test #2946: mpi_dst_example_simple_lap_z_facto2_sched1_not_svdend ...................***Timeout 333.39 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2946: mpi_dst_example_simple_lap_z_facto2_sched1_not_svdend Test #2947: mpi_dst_example_simple_lap_z_facto2_sched1_kway_svdbegin ................***Timeout 333.39 sec ischedInit: The thread number has been automatically set to 256 Start 2947: 
mpi_dst_example_simple_lap_z_facto2_sched1_kway_svdbegin Test #2948: mpi_dst_example_simple_lap_z_facto2_sched1_kway_svdend ..................***Timeout 333.41 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2948: mpi_dst_example_simple_lap_z_facto2_sched1_kway_svdend Test #2949: mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_svdbegin .....***Timeout 333.38 sec Start 2949: mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_svdbegin Test #2950: mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_svdend .......***Timeout 333.35 sec Start 2950: mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_svdend Test #2951: mpi_dst_example_simple_lap_z_facto2_sched1_not_pqrcpbegin ...............***Timeout 333.32 sec Start 2951: mpi_dst_example_simple_lap_z_facto2_sched1_not_pqrcpbegin Test #2952: mpi_dst_example_simple_lap_z_facto2_sched1_not_pqrcpend .................***Timeout 
333.33 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 2952: mpi_dst_example_simple_lap_z_facto2_sched1_not_pqrcpend Test #2954: mpi_dst_example_simple_lap_z_facto2_sched1_kway_pqrcpend ................***Timeout 333.30 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 
160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.055836e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.621236e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.459484e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.197730e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.640864e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.355845e-01 s Time to initialize coeftab 1.950205e-01 s Start 2954: mpi_dst_example_simple_lap_z_facto2_sched1_kway_pqrcpend Test #2955: mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_pqrcpbegin ...***Timeout 333.31 sec ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2955: mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_pqrcpbegin Test #2956: mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_pqrcpend .....***Timeout 333.35 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS 
Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2956: mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_pqrcpend Test #2957: mpi_dst_example_simple_lap_z_facto2_sched1_not_rqrcpbegin ...............***Timeout 333.33 sec Start 2957: mpi_dst_example_simple_lap_z_facto2_sched1_not_rqrcpbegin Test #2958: mpi_dst_example_simple_lap_z_facto2_sched1_not_rqrcpend .................***Timeout 333.29 sec ischedInit: The thread number has been automatically set to 256 Start 2958: mpi_dst_example_simple_lap_z_facto2_sched1_not_rqrcpend Test #2959: mpi_dst_example_simple_lap_z_facto2_sched1_kway_rqrcpbegin ..............***Timeout 333.32 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2959: mpi_dst_example_simple_lap_z_facto2_sched1_kway_rqrcpbegin Test #2960: mpi_dst_example_simple_lap_z_facto2_sched1_kway_rqrcpend ................***Timeout 333.35 sec ischedInit: The thread number has been automatically set to 256 Start 2960: mpi_dst_example_simple_lap_z_facto2_sched1_kway_rqrcpend Test #2961: mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_rqrcpbegin ...***Timeout 333.39 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low 
rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2961: mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_rqrcpbegin Test #2962: mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_rqrcpend .....***Timeout 333.37 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 
2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 2962: mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_rqrcpend Test #2963: mpi_dst_example_simple_lap_z_facto2_sched1_not_tqrcpbegin ...............***Timeout 333.35 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2963: mpi_dst_example_simple_lap_z_facto2_sched1_not_tqrcpbegin Test #2964: mpi_dst_example_simple_lap_z_facto2_sched1_not_tqrcpend .................***Timeout 333.29 sec Start 2964: mpi_dst_example_simple_lap_z_facto2_sched1_not_tqrcpend Test #2965: mpi_dst_example_simple_lap_z_facto2_sched1_kway_tqrcpbegin ..............***Timeout 333.25 sec Start 2965: mpi_dst_example_simple_lap_z_facto2_sched1_kway_tqrcpbegin Test #2966: mpi_dst_example_simple_lap_z_facto2_sched1_kway_tqrcpend ................***Timeout 333.29 sec Start 2966: mpi_dst_example_simple_lap_z_facto2_sched1_kway_tqrcpend Test #2967: mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_tqrcpbegin ...***Timeout 333.31 sec Start 2967: mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_tqrcpbegin Test #2968: mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_tqrcpend .....***Timeout 333.35 sec Start 2968: mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_tqrcpend Test #2970: mpi_dst_example_simple_lap_z_facto2_sched1_not_rqrrtend .................***Timeout 333.31 sec Start 2970: mpi_dst_example_simple_lap_z_facto2_sched1_not_rqrrtend Test #2971: mpi_dst_example_simple_lap_z_facto2_sched1_kway_rqrrtbegin ..............***Timeout 333.25 sec Start 2971: mpi_dst_example_simple_lap_z_facto2_sched1_kway_rqrrtbegin Test #2972: mpi_dst_example_simple_lap_z_facto2_sched1_kway_rqrrtend ................***Timeout 333.19 sec Start 2972: mpi_dst_example_simple_lap_z_facto2_sched1_kway_rqrrtend Test #2973: 
mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_rqrrtbegin ...***Timeout 333.18 sec Start 2973: mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_rqrrtbegin Test #2974: mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_rqrrtend .....***Timeout 333.23 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 2974: mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_rqrrtend Test #2975: mpi_dst_example_simple_lap_z_facto2_sched1_kway_pqrcpilu0 ...............***Timeout 333.25 sec ischedInit: The thread number has been automatically set to 256 Start 2975: mpi_dst_example_simple_lap_z_facto2_sched1_kway_pqrcpilu0 Test #2976: mpi_dst_example_simple_lap_z_facto2_sched1_kway_pqrcpilu1 ...............***Timeout 333.31 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ 
Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 2976: mpi_dst_example_simple_lap_z_facto2_sched1_kway_pqrcpilu1 Test #2977: mpi_dst_example_simple_lap_z_facto3_sched1_not_svdbegin .................***Timeout 333.39 sec ischedInit: The thread number has been automatically set to 256 Start 2977: mpi_dst_example_simple_lap_z_facto3_sched1_not_svdbegin Test #2978: mpi_dst_example_simple_lap_z_facto3_sched1_not_svdend ...................***Timeout 333.46 sec Start 2978: mpi_dst_example_simple_lap_z_facto3_sched1_not_svdend Test #2979: mpi_dst_example_simple_lap_z_facto3_sched1_kway_svdbegin ................***Timeout 333.41 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 
Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2979: mpi_dst_example_simple_lap_z_facto3_sched1_kway_svdbegin Test #2980: mpi_dst_example_simple_lap_z_facto3_sched1_kway_svdend ..................***Timeout 333.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 
0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 2980: mpi_dst_example_simple_lap_z_facto3_sched1_kway_svdend Test #2981: mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_svdbegin .....***Timeout 333.41 sec ischedInit: The thread number has been automatically set to 256 Start 2981: mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_svdbegin Test #2982: mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_svdend .......***Timeout 333.48 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 2982: mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_svdend Test #2983: mpi_dst_example_simple_lap_z_facto3_sched1_not_pqrcpbegin ...............***Timeout 333.54 sec Start 2983: mpi_dst_example_simple_lap_z_facto3_sched1_not_pqrcpbegin Test #2984: mpi_dst_example_simple_lap_z_facto3_sched1_not_pqrcpend 
.................***Timeout 333.51 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2984: mpi_dst_example_simple_lap_z_facto3_sched1_not_pqrcpend Test #2985: mpi_dst_example_simple_lap_z_facto3_sched1_kway_pqrcpbegin ..............***Timeout 333.44 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.814092e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of 
nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.851711e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.806466e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.343916e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.027432e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 1.745511e+00 s Time to initialize coeftab 5.660891e-01 s Time to factorize 1.952094e+01 s ( 1.04 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Start 2985: mpi_dst_example_simple_lap_z_facto3_sched1_kway_pqrcpbegin Test #2986: mpi_dst_example_simple_lap_z_facto3_sched1_kway_pqrcpend ................***Timeout 333.39 sec ischedInit: The thread number has been automatically set to 256 Start 2986: mpi_dst_example_simple_lap_z_facto3_sched1_kway_pqrcpend Test #2987: mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_pqrcpbegin ...***Timeout 333.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 2987: mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_pqrcpbegin Test #2988: mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_pqrcpend .....***Timeout 333.42 sec Start 2988: 
mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_pqrcpend Test #2990: mpi_dst_example_simple_lap_z_facto3_sched1_not_rqrcpend .................***Timeout 333.43 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2990: mpi_dst_example_simple_lap_z_facto3_sched1_not_rqrcpend Test #2991: mpi_dst_example_simple_lap_z_facto3_sched1_kway_rqrcpbegin ..............***Timeout 333.50 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has 
been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.079380e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.217200e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.176329e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.149767e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 
9.277161e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Start 2991: mpi_dst_example_simple_lap_z_facto3_sched1_kway_rqrcpbegin Test #2992: mpi_dst_example_simple_lap_z_facto3_sched1_kway_rqrcpend ................***Timeout 333.62 sec Start 2992: mpi_dst_example_simple_lap_z_facto3_sched1_kway_rqrcpend Test #2993: mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_rqrcpbegin ...***Timeout 333.70 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 2993: mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_rqrcpbegin Test #2994: mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_rqrcpend .....***Timeout 333.80 sec Start 2994: mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_rqrcpend Test #2995: mpi_dst_example_simple_lap_z_facto3_sched1_not_tqrcpbegin ...............***Timeout 333.88 sec Start 2995: mpi_dst_example_simple_lap_z_facto3_sched1_not_tqrcpbegin Test #2996: 
mpi_dst_example_simple_lap_z_facto3_sched1_not_tqrcpend .................***Timeout 334.07 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 2996: mpi_dst_example_simple_lap_z_facto3_sched1_not_tqrcpend Test #2998: mpi_dst_example_simple_lap_z_facto3_sched1_kway_tqrcpend ................***Timeout 334.08 sec ischedInit: The thread number has been automatically set to 256 Start 2998: mpi_dst_example_simple_lap_z_facto3_sched1_kway_tqrcpend Test #2999: mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_tqrcpbegin ...***Timeout 334.04 sec Start 2999: 
mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_tqrcpbegin Test #3000: mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_tqrcpend .....***Timeout 334.02 sec ischedInit: The thread number has been automatically set to 256 Start 3000: mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_tqrcpend Test #3002: mpi_dst_example_simple_lap_z_facto3_sched1_not_rqrrtend .................***Timeout 333.94 sec ischedInit: The thread number has been automatically set to 256 Start 3002: mpi_dst_example_simple_lap_z_facto3_sched1_not_rqrrtend Test #3006: mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_rqrrtend .....***Timeout 333.69 sec Start 3006: mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_rqrrtend Test #3007: mpi_dst_example_simple_lap_z_facto3_sched1_kway_pqrcpilu0 ...............***Timeout 333.62 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 3007: mpi_dst_example_simple_lap_z_facto3_sched1_kway_pqrcpilu0 Test #3009: mpi_dst_example_simple_lap_z_facto4_sched1_not_svdbegin 
.................***Timeout 333.56 sec Start 3009: mpi_dst_example_simple_lap_z_facto4_sched1_not_svdbegin Test #3010: mpi_dst_example_simple_lap_z_facto4_sched1_not_svdend ...................***Timeout 333.53 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3010: mpi_dst_example_simple_lap_z_facto4_sched1_not_svdend Test #3011: mpi_dst_example_simple_lap_z_facto4_sched1_kway_svdbegin ................***Timeout 333.54 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : 
Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 3011: mpi_dst_example_simple_lap_z_facto4_sched1_kway_svdbegin Test #3012: mpi_dst_example_simple_lap_z_facto4_sched1_kway_svdend ..................***Timeout 333.64 sec Start 3012: mpi_dst_example_simple_lap_z_facto4_sched1_kway_svdend Test #3013: mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_svdbegin .....***Timeout 333.69 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress 
minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 3013: mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_svdbegin Test #3014: mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_svdend .......***Timeout 333.70 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 3014: mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_svdend Test #3015: mpi_dst_example_simple_lap_z_facto4_sched1_not_pqrcpbegin ...............***Timeout 333.77 sec ischedInit: The thread number has been automatically set to 256 Start 3015: mpi_dst_example_simple_lap_z_facto4_sched1_not_pqrcpbegin Test #3016: mpi_dst_example_simple_lap_z_facto4_sched1_not_pqrcpend .................***Timeout 333.89 sec Start 3016: 
mpi_dst_example_simple_lap_z_facto4_sched1_not_pqrcpend Test #3017: mpi_dst_example_simple_lap_z_facto4_sched1_kway_pqrcpbegin ..............***Timeout 333.94 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 3017: mpi_dst_example_simple_lap_z_facto4_sched1_kway_pqrcpbegin Test #3018: mpi_dst_example_simple_lap_z_facto4_sched1_kway_pqrcpend ................***Timeout 333.91 sec Start 3018: mpi_dst_example_simple_lap_z_facto4_sched1_kway_pqrcpend Test #3019: mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_pqrcpbegin ...***Timeout 333.89 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 3019: mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_pqrcpbegin Test #3021: mpi_dst_example_simple_lap_z_facto4_sched1_not_rqrcpbegin ...............***Timeout 333.82 sec Start 3021: 
mpi_dst_example_simple_lap_z_facto4_sched1_not_rqrcpbegin Test #3022: mpi_dst_example_simple_lap_z_facto4_sched1_not_rqrcpend .................***Timeout 333.81 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 3022: mpi_dst_example_simple_lap_z_facto4_sched1_not_rqrcpend Test #3023: mpi_dst_example_simple_lap_z_facto4_sched1_kway_rqrcpbegin ..............***Timeout 333.86 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 3023: mpi_dst_example_simple_lap_z_facto4_sched1_kway_rqrcpbegin Test #3024: mpi_dst_example_simple_lap_z_facto4_sched1_kway_rqrcpend ................***Timeout 333.91 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 3024: mpi_dst_example_simple_lap_z_facto4_sched1_kway_rqrcpend Test #3025: mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_rqrcpbegin ...***Timeout 333.91 sec ischedInit: The thread number has been automatically set to 256 Start 3025: mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_rqrcpbegin Test #3026: mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_rqrcpend .....***Timeout 333.90 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 
Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 3026: mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_rqrcpend Test #3027: mpi_dst_example_simple_lap_z_facto4_sched1_not_tqrcpbegin ...............***Timeout 333.90 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 3027: mpi_dst_example_simple_lap_z_facto4_sched1_not_tqrcpbegin Test #3028: mpi_dst_example_simple_lap_z_facto4_sched1_not_tqrcpend .................***Timeout 333.95 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: 
Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3028: mpi_dst_example_simple_lap_z_facto4_sched1_not_tqrcpend Test #3029: mpi_dst_example_simple_lap_z_facto4_sched1_kway_tqrcpbegin ..............***Timeout 334.06 sec ischedInit: The thread number has been automatically set to 256 Start 3029: mpi_dst_example_simple_lap_z_facto4_sched1_kway_tqrcpbegin Test #3030: mpi_dst_example_simple_lap_z_facto4_sched1_kway_tqrcpend ................***Timeout 334.09 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 3030: mpi_dst_example_simple_lap_z_facto4_sched1_kway_tqrcpend Test #3031: mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_tqrcpbegin ...***Timeout 334.11 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX 
package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 3031: mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_tqrcpbegin Test #3032: mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_tqrcpend .....***Timeout 334.12 sec ischedInit: The thread number has been automatically set to 256 Start 3032: mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_tqrcpend Test #3033: mpi_dst_example_simple_lap_z_facto4_sched1_not_rqrrtbegin ...............***Timeout 334.13 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 3033: mpi_dst_example_simple_lap_z_facto4_sched1_not_rqrrtbegin Test #3034: mpi_dst_example_simple_lap_z_facto4_sched1_not_rqrrtend .................***Timeout 334.22 sec Start 3034: mpi_dst_example_simple_lap_z_facto4_sched1_not_rqrrtend Test #3035: mpi_dst_example_simple_lap_z_facto4_sched1_kway_rqrrtbegin ..............***Timeout 334.24 sec Start 3035: 
mpi_dst_example_simple_lap_z_facto4_sched1_kway_rqrrtbegin Test #3036: mpi_dst_example_simple_lap_z_facto4_sched1_kway_rqrrtend ................***Timeout 334.12 sec Start 3036: mpi_dst_example_simple_lap_z_facto4_sched1_kway_rqrrtend Test #3037: mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_rqrrtbegin ...***Timeout 334.10 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 3037: mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_rqrrtbegin Test #3038: mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_rqrrtend .....***Timeout 334.12 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel 
Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.584497e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.267797e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.173443e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.050379e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.054275e+01 s +-------------------------------------------------+ Factorization task: 
Factorization used: LDL^h Start 3038: mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_rqrrtend Test #3039: mpi_dst_example_simple_lap_z_facto4_sched1_kway_pqrcpilu0 ...............***Timeout 334.12 sec Start 3039: mpi_dst_example_simple_lap_z_facto4_sched1_kway_pqrcpilu0 Test #3040: mpi_dst_example_simple_lap_z_facto4_sched1_kway_pqrcpilu1 ...............***Timeout 334.17 sec ischedInit: The thread number has been automatically set to 256 Start 3040: mpi_dst_example_simple_lap_z_facto4_sched1_kway_pqrcpilu1 Test #3041: mpi_dst_example_simple_lap_s_facto0_sched4_not_svdbegin .................***Timeout 334.22 sec Start 3041: mpi_dst_example_simple_lap_s_facto0_sched4_not_svdbegin Test #3042: mpi_dst_example_simple_lap_s_facto0_sched4_not_svdend ...................***Timeout 334.21 sec ischedInit: The thread number has been automatically set to 256 Start 3042: mpi_dst_example_simple_lap_s_facto0_sched4_not_svdend Test #3043: mpi_dst_example_simple_lap_s_facto0_sched4_kway_svdbegin ................***Timeout 334.24 sec Start 3043: mpi_dst_example_simple_lap_s_facto0_sched4_kway_svdbegin Test #3044: mpi_dst_example_simple_lap_s_facto0_sched4_kway_svdend ..................***Timeout 334.27 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: 
Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3044: mpi_dst_example_simple_lap_s_facto0_sched4_kway_svdend Test #3045: mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_svdbegin .....***Timeout 334.28 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 3045: mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_svdbegin Test #3046: mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_svdend .......***Timeout 
334.32 sec Start 3046: mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_svdend Test #3047: mpi_dst_example_simple_lap_s_facto0_sched4_not_pqrcpbegin ...............***Timeout 334.30 sec ischedInit: The thread number has been automatically set to 256 Start 3047: mpi_dst_example_simple_lap_s_facto0_sched4_not_pqrcpbegin Test #3048: mpi_dst_example_simple_lap_s_facto0_sched4_not_pqrcpend .................***Timeout 334.30 sec Start 3048: mpi_dst_example_simple_lap_s_facto0_sched4_not_pqrcpend Test #3050: mpi_dst_example_simple_lap_s_facto0_sched4_kway_pqrcpend ................***Timeout 334.32 sec Start 3050: mpi_dst_example_simple_lap_s_facto0_sched4_kway_pqrcpend Test #3051: mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_pqrcpbegin ...***Timeout 334.23 sec Start 3051: mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_pqrcpbegin Test #3052: mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_pqrcpend .....***Timeout 334.18 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 3052: mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_pqrcpend Test #3053: mpi_dst_example_simple_lap_s_facto0_sched4_not_rqrcpbegin ...............***Timeout 334.13 sec Start 3053: mpi_dst_example_simple_lap_s_facto0_sched4_not_rqrcpbegin Test #3054: mpi_dst_example_simple_lap_s_facto0_sched4_not_rqrcpend .................***Timeout 334.17 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 3054: mpi_dst_example_simple_lap_s_facto0_sched4_not_rqrcpend Test #3055: mpi_dst_example_simple_lap_s_facto0_sched4_kway_rqrcpbegin ..............***Timeout 334.28 sec Start 3055: mpi_dst_example_simple_lap_s_facto0_sched4_kway_rqrcpbegin Test #3056: mpi_dst_example_simple_lap_s_facto0_sched4_kway_rqrcpend ................***Timeout 334.37 sec Start 3056: 
mpi_dst_example_simple_lap_s_facto0_sched4_kway_rqrcpend Test #3058: mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_rqrcpend .....***Timeout 334.40 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 3058: mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_rqrcpend Test #3059: mpi_dst_example_simple_lap_s_facto0_sched4_not_tqrcpbegin ...............***Timeout 334.37 sec ischedInit: The thread number has been automatically set to 256 Start 3059: mpi_dst_example_simple_lap_s_facto0_sched4_not_tqrcpbegin Test #3060: mpi_dst_example_simple_lap_s_facto0_sched4_not_tqrcpend .................***Timeout 334.35 sec Start 3060: mpi_dst_example_simple_lap_s_facto0_sched4_not_tqrcpend Test #3061: mpi_dst_example_simple_lap_s_facto0_sched4_kway_tqrcpbegin ..............***Timeout 334.31 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 
Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 3061: mpi_dst_example_simple_lap_s_facto0_sched4_kway_tqrcpbegin Test #3062: mpi_dst_example_simple_lap_s_facto0_sched4_kway_tqrcpend ................***Timeout 334.29 sec Start 3062: mpi_dst_example_simple_lap_s_facto0_sched4_kway_tqrcpend 2755/3626 Test #3159: mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_tqrcpbegin ...***Timeout 333.77 sec Start 3159: mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_tqrcpbegin 2755/3626 Test #3161: mpi_dst_example_simple_lap_d_facto0_sched4_not_rqrrtbegin ...............***Timeout 333.77 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 3161: mpi_dst_example_simple_lap_d_facto0_sched4_not_rqrrtbegin 2755/3626 Test #3162: mpi_dst_example_simple_lap_d_facto0_sched4_not_rqrrtend .................***Timeout 333.76 sec Start 3162: mpi_dst_example_simple_lap_d_facto0_sched4_not_rqrrtend 2755/3626 Test #3163: mpi_dst_example_simple_lap_d_facto0_sched4_kway_rqrrtbegin ..............***Timeout 333.70 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 
Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 3163: mpi_dst_example_simple_lap_d_facto0_sched4_kway_rqrrtbegin 2755/3626 Test #3164: mpi_dst_example_simple_lap_d_facto0_sched4_kway_rqrrtend ................***Timeout 313.52 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 3164: mpi_dst_example_simple_lap_d_facto0_sched4_kway_rqrrtend 2755/3626 Test #3179: mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_pqrcpbegin ...***Timeout 273.43 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of 
projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.038897e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.443750e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.116671e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.949617e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.798352e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 8.697737e-02 s Time to initialize coeftab 9.598310e-02 s Time to factorize 1.290037e+00 s ( 4.06 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 7.584901e-01 s - iteration 1 : total iteration time 0.889 s error 2.2411e-14 Time for refinement 1.733252e+00 s || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 
2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.241135e-14 max(|| b_i - A x_i ||_1) 4.449898e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.591680e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.241135e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.241135e-14 max(|| b_i - A x_i ||_1) 4.449898e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.591680e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 4.449898e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.591680e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.241135e-14 max(|| b_i - A x_i ||_1) 4.449898e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.591680e-02 (SUCCESS) Start 3179: mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_pqrcpbegin 2755/3626 Test #3186: mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_rqrcpend .....***Timeout 278.19 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of 
projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.158223e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.486783e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.076924e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.864745e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.284463e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.642366e-01 s Time to initialize coeftab 5.630109e-02 s Time to factorize 1.347613e+00 s ( 3.88 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko Start 3186: mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_rqrcpend Start 3258: 
mpi_dst_example_simple_lap_c_facto0_sched4_not_rqrrtend Start 3259: mpi_dst_example_simple_lap_c_facto0_sched4_kway_rqrrtbegin Start 3260: mpi_dst_example_simple_lap_c_facto0_sched4_kway_rqrrtend Start 3261: mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_rqrrtbegin Start 3262: mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_rqrrtend Start 3263: mpi_dst_example_simple_lap_c_facto0_sched4_kway_pqrcpilu0 Start 3264: mpi_dst_example_simple_lap_c_facto0_sched4_kway_pqrcpilu1 Start 3265: mpi_dst_example_simple_lap_c_facto1_sched4_not_svdbegin Start 3266: mpi_dst_example_simple_lap_c_facto1_sched4_not_svdend Test #2714: mpi_dst_example_simple_lap_d_facto2_sched1_not_rqrrtend ................. Passed 139.82 sec Test #2720: mpi_dst_example_simple_lap_d_facto2_sched1_kway_pqrcpilu1 ............... Passed 139.79 sec Test #2729: mpi_dst_example_simple_lap_c_facto0_sched1_kway_pqrcpbegin .............. Passed 139.73 sec Test #2731: mpi_dst_example_simple_lap_c_facto0_sched1_kwayprojections_pqrcpbegin ... 
Passed 139.72 sec Test #2732: mpi_dst_example_simple_lap_c_facto0_sched1_kwayprojections_pqrcpend .....***Failed 139.71 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 [arch-nspawn-3655178:3691059] *** Process received signal *** [arch-nspawn-3655178:3691059] Signal: Segmentation fault (11) [arch-nspawn-3655178:3691059] Signal code: Address not mapped (1) [arch-nspawn-3655178:3691059] Failing at address: 0x7f431c7a1860 [arch-nspawn-3655178:3691097] *** Process received signal *** [arch-nspawn-3655178:3691059] [ 0] linux-vdso.so.1(__vdso_rt_sigreturn+0x0) [0x7f3dde6cc6cc] [arch-nspawn-3655178:3691059] [ 1] /usr/lib/libopen-pal.so.80(mca_btl_sm_poll_handle_frag+0x18a) [0x7f3dd44f3a02] [arch-nspawn-3655178:3691059] [ 2] /usr/lib/libopen-pal.so.80(+0x74504) [0x7f3dd44f4504] [arch-nspawn-3655178:3691059] [ 3] /usr/lib/libopen-pal.so.80(opal_progress+0x30) [0x7f3dd44a5a7a] [arch-nspawn-3655178:3691059] [ 4] /usr/lib/libopen-pal.so.80(ompi_sync_wait_mt+0xda) [0x7f3dd44d2aa2] [arch-nspawn-3655178:3691059] [ 5] /usr/lib/libmpi.so.40(+0x7de1a) [0x7f3dd4c7de1a] [arch-nspawn-3655178:3691059] [ 6] /usr/lib/libmpi.so.40(ompi_request_default_wait+0x1a) [0x7f3dd4c8019c] [arch-nspawn-3655178:3691059] [ 7] /usr/lib/libmpi.so.40(ompi_coll_base_sendrecv_actual+0x98) [0x7f3dd4cf03e8] [arch-nspawn-3655178:3691059] [ 8] /usr/lib/libmpi.so.40(ompi_coll_base_allreduce_intra_recursivedoubling+0x210) [0x7f3dd4cf1a88] [arch-nspawn-3655178:3691059] [ 9] /usr/lib/libmpi.so.40(ompi_coll_base_allreduce_intra_ring+0x3fc) [0x7f3dd4cf443c] [arch-nspawn-3655178:3691059] [10] /usr/lib/libmpi.so.40(ompi_coll_tuned_allreduce_intra_dec_fixed+0x40) [0x7f3dd4d15152] [arch-nspawn-3655178:3691059] [11] /usr/lib/libmpi.so.40(MPI_Allreduce+0x294) [0x7f3dd4c8e584] [arch-nspawn-3655178:3691059] [12] /build/pastix/src/build/spm/src/libspm.so.1(spmUpdateComputedFields+0x140) [0x7f3ddd940458] [arch-nspawn-3655178:3691059] [13] 
/build/pastix/src/build/spm/src/libspm.so.1(genLaplacian+0xaa) [0x7f3ddd94921e] [arch-nspawn-3655178:3691059] [14] /build/pastix/src/build/spm/src/libspm.so.1(+0x409c8) [0x7f3ddd94a9c8] [arch-nspawn-3655178:3691059] [15] ./simple(+0xe2c) [0x555555556e2c] [arch-nspawn-3655178:3691059] [16] /usr/lib/libc.so.6(+0x27fae) [0x7f3dd4aa4fae] [arch-nspawn-3655178:3691059] [17] /usr/lib/libc.so.6(__libc_start_main+0x72) [0x7f3dd4aa50b8] [arch-nspawn-3655178:3691059] [18] ./simple(+0x1174) [0x555555557174] [arch-nspawn-3655178:3691059] *** End of error message *** [arch-nspawn-3655178:3691097] Signal: Segmentation fault (11) [arch-nspawn-3655178:3691097] Signal code: Address not mapped (1) [arch-nspawn-3655178:3691097] Failing at address: 0x7f21355a1860 [arch-nspawn-3655178:3691097] [ 0] linux-vdso.so.1(__vdso_rt_sigreturn+0x0) [0x7f2d473dc6cc] [arch-nspawn-3655178:3691097] [ 1] /usr/lib/libopen-pal.so.80(mca_btl_sm_poll_handle_frag+0x18a) [0x7f2d3d2f3a02] [arch-nspawn-3655178:3691097] [ 2] /usr/lib/libopen-pal.so.80(+0x74504) [0x7f2d3d2f4504] [arch-nspawn-3655178:3691097] [ 3] /usr/lib/libopen-pal.so.80(opal_progress+0x30) [0x7f2d3d2a5a7a] [arch-nspawn-3655178:3691097] [ 4] /usr/lib/libopen-pal.so.80(ompi_sync_wait_mt+0xda) [0x7f2d3d2d2aa2] [arch-nspawn-3655178:3691097] [ 5] /usr/lib/libmpi.so.40(+0x7de1a) [0x7f2d3d87de1a] [arch-nspawn-3655178:3691097] [ 6] /usr/lib/libmpi.so.40(ompi_request_default_wait+0x1a) [0x7f2d3d88019c] [arch-nspawn-3655178:3691097] [ 7] /usr/lib/libmpi.so.40(ompi_coll_base_sendrecv_actual+0x98) [0x7f2d3d8f03e8] [arch-nspawn-3655178:3691097] [ 8] /usr/lib/libmpi.so.40(ompi_coll_base_allreduce_intra_recursivedoubling+0x210) [0x7f2d3d8f1a88] [arch-nspawn-3655178:3691097] [ 9] /usr/lib/libmpi.so.40(ompi_coll_base_allreduce_intra_ring+0x3fc) [0x7f2d3d8f443c] [arch-nspawn-3655178:3691097] [10] /usr/lib/libmpi.so.40(ompi_coll_tuned_allreduce_intra_dec_fixed+0x40) [0x7f2d3d915152] [arch-nspawn-3655178:3691097] [11] /usr/lib/libmpi.so.40(MPI_Allreduce+0x294) 
[0x7f2d3d88e584] [arch-nspawn-3655178:3691097] [12] /build/pastix/src/build/spm/src/libspm.so.1(spmUpdateComputedFields+0x140) [0x7f2d4731b458] [arch-nspawn-3655178:3691097] [13] /build/pastix/src/build/spm/src/libspm.so.1(genLaplacian+0xaa) [0x7f2d4732421e] [arch-nspawn-3655178:3691097] [14] /build/pastix/src/build/spm/src/libspm.so.1(+0x409c8) [0x7f2d473259c8] [arch-nspawn-3655178:3691097] [15] ./simple(+0xe2c) [0x555555556e2c] [arch-nspawn-3655178:3691097] [16] /usr/lib/libc.so.6(+0x27fae) [0x7f2d464a3fae] [arch-nspawn-3655178:3691097] [17] /usr/lib/libc.so.6(__libc_start_main+0x72) [0x7f2d464a40b8] [arch-nspawn-3655178:3691097] [18] ./simple(+0x1174) [0x555555557174] [arch-nspawn-3655178:3691097] *** End of error message *** -------------------------------------------------------------------------- prte noticed that process rank 2 with PID 3691059 on node arch-nspawn-3655178 exited on signal 11 (Segmentation fault). -------------------------------------------------------------------------- Test #2757: mpi_dst_example_simple_lap_c_facto1_sched1_kwayprojections_svdbegin ..... Passed 139.54 sec Test #2759: mpi_dst_example_simple_lap_c_facto1_sched1_not_pqrcpbegin ............... Passed 139.55 sec Test #2762: mpi_dst_example_simple_lap_c_facto1_sched1_kway_pqrcpend ................ Passed 139.53 sec Test #2768: mpi_dst_example_simple_lap_c_facto1_sched1_kway_rqrcpend ................ Passed 139.49 sec Test #2771: mpi_dst_example_simple_lap_c_facto1_sched1_not_tqrcpbegin ............... Passed 139.51 sec Test #2772: mpi_dst_example_simple_lap_c_facto1_sched1_not_tqrcpend ................. Passed 139.51 sec Test #2774: mpi_dst_example_simple_lap_c_facto1_sched1_kway_tqrcpend ................ Passed 139.53 sec Test #2786: mpi_dst_example_simple_lap_c_facto2_sched1_not_svdend ................... Passed 139.48 sec Test #2794: mpi_dst_example_simple_lap_c_facto2_sched1_kway_pqrcpend ................ 
Passed 139.45 sec Test #2796: mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_pqrcpend ..... Passed 139.38 sec Test #2805: mpi_dst_example_simple_lap_c_facto2_sched1_kway_tqrcpbegin .............. Passed 138.89 sec Test #2826: mpi_dst_example_simple_lap_c_facto3_sched1_kway_pqrcpend ................ Passed 138.47 sec Test #2873: mpi_dst_example_simple_lap_c_facto4_sched1_not_rqrrtbegin ............... Passed 136.17 sec 2773/3626 Test #3166: mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_rqrrtend .....***Timeout 326.21 sec ischedInit: The thread number has been automatically set to 256 Start 3166: mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_rqrrtend 2773/3626 Test #3167: mpi_dst_example_simple_lap_d_facto0_sched4_kway_pqrcpilu0 ...............***Timeout 325.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3167: mpi_dst_example_simple_lap_d_facto0_sched4_kway_pqrcpilu0 Test #2897: mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_rqrcpbegin ...***Timeout 324.62 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.093289e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: 
Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.196267e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.352327e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 9.528147e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.331292e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.341416e-01 s Time to initialize coeftab 3.770303e-01 s 2774/3626 Test #3168: mpi_dst_example_simple_lap_d_facto0_sched4_kway_pqrcpilu1 ...............***Timeout 322.32 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 3168: mpi_dst_example_simple_lap_d_facto0_sched4_kway_pqrcpilu1 2774/3626 Test #3169: mpi_dst_example_simple_lap_d_facto1_sched4_not_svdbegin .................***Timeout 314.49 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: 
Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3169: mpi_dst_example_simple_lap_d_facto1_sched4_not_svdbegin 2774/3626 Test #3170: mpi_dst_example_simple_lap_d_facto1_sched4_not_svdend ...................***Timeout 312.06 sec ischedInit: The thread number has been automatically set to 256 Start 3170: mpi_dst_example_simple_lap_d_facto1_sched4_not_svdend 2774/3626 Test #3174: mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_svdend .......***Timeout 299.72 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 3174: mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_svdend 2774/3626 Test #3175: mpi_dst_example_simple_lap_d_facto1_sched4_not_pqrcpbegin ...............***Timeout 293.14 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 
Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 3175: mpi_dst_example_simple_lap_d_facto1_sched4_not_pqrcpbegin 2774/3626 Test #3176: mpi_dst_example_simple_lap_d_facto1_sched4_not_pqrcpend .................***Timeout 291.66 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 3176: mpi_dst_example_simple_lap_d_facto1_sched4_not_pqrcpend 2774/3626 Test #3177: mpi_dst_example_simple_lap_d_facto1_sched4_kway_pqrcpbegin ..............***Timeout 291.13 sec ischedInit: The thread number has been automatically set to 256 Start 3177: mpi_dst_example_simple_lap_d_facto1_sched4_kway_pqrcpbegin 2774/3626 Test #3178: mpi_dst_example_simple_lap_d_facto1_sched4_kway_pqrcpend ................***Timeout 287.09 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size 
(min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Start 3178: mpi_dst_example_simple_lap_d_facto1_sched4_kway_pqrcpend 2774/3626 Test #3180: mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_pqrcpend .....***Timeout 285.50 sec ischedInit: The thread number has been automatically set to 256 Start 3180: mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_pqrcpend 2774/3626 Test #3181: mpi_dst_example_simple_lap_d_facto1_sched4_not_rqrcpbegin ...............***Timeout 285.46 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal 
Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 3181: mpi_dst_example_simple_lap_d_facto1_sched4_not_rqrcpbegin 2774/3626 Test #3182: mpi_dst_example_simple_lap_d_facto1_sched4_not_rqrcpend .................***Timeout 284.08 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 3182: mpi_dst_example_simple_lap_d_facto1_sched4_not_rqrcpend 2774/3626 Test #3183: mpi_dst_example_simple_lap_d_facto1_sched4_kway_rqrcpbegin ..............***Timeout 283.95 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 3183: mpi_dst_example_simple_lap_d_facto1_sched4_kway_rqrcpbegin Start 3267: mpi_dst_example_simple_lap_c_facto1_sched4_kway_svdbegin Start 3268: mpi_dst_example_simple_lap_c_facto1_sched4_kway_svdend Start 3269: mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_svdbegin Start 3270: mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_svdend Start 3271: mpi_dst_example_simple_lap_c_facto1_sched4_not_pqrcpbegin Start 3272: mpi_dst_example_simple_lap_c_facto1_sched4_not_pqrcpend Start 3273: mpi_dst_example_simple_lap_c_facto1_sched4_kway_pqrcpbegin Start 3274: mpi_dst_example_simple_lap_c_facto1_sched4_kway_pqrcpend Start 3275: mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_pqrcpbegin Start 3276: mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_pqrcpend Start 3277: 
mpi_dst_example_simple_lap_c_facto1_sched4_not_rqrcpbegin Start 3278: mpi_dst_example_simple_lap_c_facto1_sched4_not_rqrcpend Start 3279: mpi_dst_example_simple_lap_c_facto1_sched4_kway_rqrcpbegin Start 3280: mpi_dst_example_simple_lap_c_facto1_sched4_kway_rqrcpend Start 3281: mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_rqrcpbegin Start 3282: mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_rqrcpend Start 3283: mpi_dst_example_simple_lap_c_facto1_sched4_not_tqrcpbegin Start 3284: mpi_dst_example_simple_lap_c_facto1_sched4_not_tqrcpend Start 3285: mpi_dst_example_simple_lap_c_facto1_sched4_kway_tqrcpbegin Test #2716: mpi_dst_example_simple_lap_d_facto2_sched1_kway_rqrrtend ................ Passed 141.43 sec Test #2717: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_rqrrtbegin ... Passed 141.43 sec Test #2718: mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_rqrrtend ..... Passed 141.43 sec Test #2719: mpi_dst_example_simple_lap_d_facto2_sched1_kway_pqrcpilu0 ............... Passed 141.43 sec Test #2721: mpi_dst_example_simple_lap_c_facto0_sched1_not_svdbegin ................. Passed 141.41 sec Test #2722: mpi_dst_example_simple_lap_c_facto0_sched1_not_svdend ................... Passed 141.40 sec Test #2728: mpi_dst_example_simple_lap_c_facto0_sched1_not_pqrcpend ................. Passed 141.38 sec Test #2734: mpi_dst_example_simple_lap_c_facto0_sched1_not_rqrcpend ................. Passed 141.34 sec Test #2736: mpi_dst_example_simple_lap_c_facto0_sched1_kway_rqrcpend ................ Passed 141.32 sec Test #2742: mpi_dst_example_simple_lap_c_facto0_sched1_kway_tqrcpend ................ Passed 141.29 sec Test #2745: mpi_dst_example_simple_lap_c_facto0_sched1_not_rqrrtbegin ............... Passed 141.27 sec Test #2746: mpi_dst_example_simple_lap_c_facto0_sched1_not_rqrrtend ................. Passed 141.27 sec Test #2750: mpi_dst_example_simple_lap_c_facto0_sched1_kwayprojections_rqrrtend ..... 
Passed 141.25 sec Test #2752: mpi_dst_example_simple_lap_c_facto0_sched1_kway_pqrcpilu1 ............... Passed 141.22 sec Test #2753: mpi_dst_example_simple_lap_c_facto1_sched1_not_svdbegin ................. Passed 141.21 sec Test #2754: mpi_dst_example_simple_lap_c_facto1_sched1_not_svdend ................... Passed 141.20 sec Test #2760: mpi_dst_example_simple_lap_c_facto1_sched1_not_pqrcpend ................. Passed 141.15 sec Test #2764: mpi_dst_example_simple_lap_c_facto1_sched1_kwayprojections_pqrcpend ..... Passed 141.08 sec Test #2766: mpi_dst_example_simple_lap_c_facto1_sched1_not_rqrcpend ................. Passed 141.07 sec Test #2777: mpi_dst_example_simple_lap_c_facto1_sched1_not_rqrrtbegin ............... Passed 141.00 sec Test #2778: mpi_dst_example_simple_lap_c_facto1_sched1_not_rqrrtend ................. Passed 140.99 sec Test #2779: mpi_dst_example_simple_lap_c_facto1_sched1_kway_rqrrtbegin .............. Passed 140.99 sec Test #2787: mpi_dst_example_simple_lap_c_facto2_sched1_kway_svdbegin ................ Passed 140.94 sec Test #2789: mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_svdbegin ..... Passed 140.93 sec Test #2792: mpi_dst_example_simple_lap_c_facto2_sched1_not_pqrcpend ................. Passed 140.89 sec Test #2793: mpi_dst_example_simple_lap_c_facto2_sched1_kway_pqrcpbegin .............. Passed 140.88 sec Test #2797: mpi_dst_example_simple_lap_c_facto2_sched1_not_rqrcpbegin ............... Passed 140.80 sec Test #2798: mpi_dst_example_simple_lap_c_facto2_sched1_not_rqrcpend ................. Passed 140.80 sec Test #2799: mpi_dst_example_simple_lap_c_facto2_sched1_kway_rqrcpbegin .............. Passed 140.80 sec Test #2800: mpi_dst_example_simple_lap_c_facto2_sched1_kway_rqrcpend ................ Passed 140.79 sec Test #2801: mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_rqrcpbegin ... Passed 140.67 sec Test #2804: mpi_dst_example_simple_lap_c_facto2_sched1_not_tqrcpend ................. 
Passed 140.28 sec Test #2828: mpi_dst_example_simple_lap_c_facto3_sched1_kwayprojections_pqrcpend ..... Passed 139.80 sec Test #2829: mpi_dst_example_simple_lap_c_facto3_sched1_not_rqrcpbegin ............... Passed 139.80 sec Test #2832: mpi_dst_example_simple_lap_c_facto3_sched1_kway_rqrcpend ................ Passed 139.74 sec Test #2838: mpi_dst_example_simple_lap_c_facto3_sched1_kway_tqrcpend ................ Passed 139.48 sec Test #2868: mpi_dst_example_simple_lap_c_facto4_sched1_not_tqrcpend ................. Passed 137.97 sec Test #2874: mpi_dst_example_simple_lap_c_facto4_sched1_not_rqrrtend ................. Passed 137.45 sec Test #2878: mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_rqrrtend ..... Passed 137.23 sec Test #2888: mpi_dst_example_simple_lap_z_facto0_sched1_not_pqrcpend ................. Passed 136.37 sec Test #2890: mpi_dst_example_simple_lap_z_facto0_sched1_kway_pqrcpend ................ Passed 136.24 sec Test #2896: mpi_dst_example_simple_lap_z_facto0_sched1_kway_rqrcpend ................ Passed 136.01 sec Test #2900: mpi_dst_example_simple_lap_z_facto0_sched1_not_tqrcpend ................. 
Passed 135.86 sec Start 3286: mpi_dst_example_simple_lap_c_facto1_sched4_kway_tqrcpend Start 3287: mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_tqrcpbegin Start 3288: mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_tqrcpend Start 3289: mpi_dst_example_simple_lap_c_facto1_sched4_not_rqrrtbegin Start 3290: mpi_dst_example_simple_lap_c_facto1_sched4_not_rqrrtend Start 3291: mpi_dst_example_simple_lap_c_facto1_sched4_kway_rqrrtbegin Start 3292: mpi_dst_example_simple_lap_c_facto1_sched4_kway_rqrrtend Start 3293: mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_rqrrtbegin Start 3294: mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_rqrrtend Start 3295: mpi_dst_example_simple_lap_c_facto1_sched4_kway_pqrcpilu0 Start 3296: mpi_dst_example_simple_lap_c_facto1_sched4_kway_pqrcpilu1 Start 3297: mpi_dst_example_simple_lap_c_facto2_sched4_not_svdbegin Start 3298: mpi_dst_example_simple_lap_c_facto2_sched4_not_svdend Start 3299: mpi_dst_example_simple_lap_c_facto2_sched4_kway_svdbegin Start 3300: mpi_dst_example_simple_lap_c_facto2_sched4_kway_svdend Start 3301: mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_svdbegin Start 3302: mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_svdend Start 3303: mpi_dst_example_simple_lap_c_facto2_sched4_not_pqrcpbegin Start 3304: mpi_dst_example_simple_lap_c_facto2_sched4_not_pqrcpend Start 3305: mpi_dst_example_simple_lap_c_facto2_sched4_kway_pqrcpbegin Start 3306: mpi_dst_example_simple_lap_c_facto2_sched4_kway_pqrcpend Start 3307: mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_pqrcpbegin Start 3308: mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_pqrcpend Start 3309: mpi_dst_example_simple_lap_c_facto2_sched4_not_rqrcpbegin Start 3310: mpi_dst_example_simple_lap_c_facto2_sched4_not_rqrcpend Start 3311: mpi_dst_example_simple_lap_c_facto2_sched4_kway_rqrcpbegin Start 3312: mpi_dst_example_simple_lap_c_facto2_sched4_kway_rqrcpend Start 3313: 
mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_rqrcpbegin Start 3314: mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_rqrcpend Start 3315: mpi_dst_example_simple_lap_c_facto2_sched4_not_tqrcpbegin Start 3316: mpi_dst_example_simple_lap_c_facto2_sched4_not_tqrcpend Start 3317: mpi_dst_example_simple_lap_c_facto2_sched4_kway_tqrcpbegin Start 3318: mpi_dst_example_simple_lap_c_facto2_sched4_kway_tqrcpend Start 3319: mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_tqrcpbegin Start 3320: mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_tqrcpend Start 3321: mpi_dst_example_simple_lap_c_facto2_sched4_not_rqrrtbegin Start 3322: mpi_dst_example_simple_lap_c_facto2_sched4_not_rqrrtend Start 3323: mpi_dst_example_simple_lap_c_facto2_sched4_kway_rqrrtbegin Start 3324: mpi_dst_example_simple_lap_c_facto2_sched4_kway_rqrrtend Start 3325: mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_rqrrtbegin Start 3326: mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_rqrrtend Start 3327: mpi_dst_example_simple_lap_c_facto2_sched4_kway_pqrcpilu0 Start 3328: mpi_dst_example_simple_lap_c_facto2_sched4_kway_pqrcpilu1 Test #2795: mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_pqrcpbegin ... Passed 142.62 sec Test #2802: mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_rqrcpend ..... Passed 142.20 sec Test #2844: mpi_dst_example_simple_lap_c_facto3_sched1_kway_rqrrtend ................ Passed 141.01 sec Test #2856: mpi_dst_example_simple_lap_c_facto4_sched1_not_pqrcpend ................. Passed 140.19 sec Test #3120: mpi_dst_example_simple_lap_s_facto2_sched4_kway_rqrcpend ................ 
Passed 109.54 sec Start 3329: mpi_dst_example_simple_lap_c_facto3_sched4_not_svdbegin Start 3330: mpi_dst_example_simple_lap_c_facto3_sched4_not_svdend Start 3331: mpi_dst_example_simple_lap_c_facto3_sched4_kway_svdbegin Start 3332: mpi_dst_example_simple_lap_c_facto3_sched4_kway_svdend Start 3333: mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_svdbegin Test #3136: mpi_dst_example_simple_lap_s_facto2_sched4_kway_pqrcpilu1 ............... Passed 105.80 sec Start 3334: mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_svdend Test #3097: mpi_dst_example_simple_lap_s_facto1_sched4_not_rqrrtbegin ............... Passed 113.19 sec Start 3335: mpi_dst_example_simple_lap_c_facto3_sched4_not_pqrcpbegin Test #2803: mpi_dst_example_simple_lap_c_facto2_sched1_not_tqrcpbegin ............... Passed 143.18 sec Start 3336: mpi_dst_example_simple_lap_c_facto3_sched4_not_pqrcpend Test #2765: mpi_dst_example_simple_lap_c_facto1_sched1_not_rqrcpbegin ............... Passed 143.98 sec Start 3337: mpi_dst_example_simple_lap_c_facto3_sched4_kway_pqrcpbegin Test #2855: mpi_dst_example_simple_lap_c_facto4_sched1_not_pqrcpbegin ...............***Failed 142.04 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block 
Absolute
Orthogonalization method CGS
Splitting Strategy Not used
Levels of projections 0
Levels of kway 2147483647
Projections distance 3
Projections depth 3
Projections width 1
ischedInit: The thread number has been automatically set to 256
ischedInit: The thread number has been automatically set to 256
ischedInit: The thread number has been automatically set to 256
[arch-nspawn-3655178:3705086] *** Process received signal ***
[arch-nspawn-3655178:3705086] Signal: Segmentation fault (11)
[arch-nspawn-3655178:3705086] Signal code: Address not mapped (1)
[arch-nspawn-3655178:3705086] Failing at address: 0x7f431c7a1860
[arch-nspawn-3655178:3705086] [ 0] linux-vdso.so.1(__vdso_rt_sigreturn+0x0) [0x7f21473d46cc]
[arch-nspawn-3655178:3705086] [ 1] /usr/lib/libopen-pal.so.80(mca_btl_sm_poll_handle_frag+0x18a) [0x7f21467a3a02]
[arch-nspawn-3655178:3705086] [ 2] /usr/lib/libopen-pal.so.80(+0x74504) [0x7f21467a4504]
[arch-nspawn-3655178:3705086] [ 3] /usr/lib/libopen-pal.so.80(opal_progress+0x30) [0x7f2146755a7a]
[arch-nspawn-3655178:3705086] [ 4] /usr/lib/libopen-pal.so.80(ompi_sync_wait_mt+0xda) [0x7f2146782aa2]
[arch-nspawn-3655178:3705086] [ 5] /usr/lib/libmpi.so.40(+0x7de1a) [0x7f2146e7de1a]
[arch-nspawn-3655178:3705086] [ 6] /usr/lib/libmpi.so.40(ompi_request_default_wait+0x1a) [0x7f2146e8019c]
[arch-nspawn-3655178:3705086] [ 7] /usr/lib/libmpi.so.40(ompi_coll_base_sendrecv_actual+0x98) [0x7f2146ef03e8]
[arch-nspawn-3655178:3705086] [ 8] /usr/lib/libmpi.so.40(ompi_coll_base_allreduce_intra_recursivedoubling+0x210) [0x7f2146ef1a88]
[arch-nspawn-3655178:3705086] [ 9] /usr/lib/libmpi.so.40(ompi_coll_base_allreduce_intra_ring+0x3fc) [0x7f2146ef443c]
[arch-nspawn-3655178:3705086] [10] /usr/lib/libmpi.so.40(ompi_coll_tuned_allreduce_intra_dec_fixed+0x40) [0x7f2146f15152]
[arch-nspawn-3655178:3705086] [11] /usr/lib/libmpi.so.40(MPI_Allreduce+0x294) [0x7f2146e8e584]
[arch-nspawn-3655178:3705086] [12] /build/pastix/src/build/spm/src/libspm.so.1(spmUpdateComputedFields+0x140) [0x7f2147313458]
[arch-nspawn-3655178:3705086] [13] /build/pastix/src/build/spm/src/libspm.so.1(genLaplacian+0xaa) [0x7f214731c21e]
[arch-nspawn-3655178:3705086] [14] /build/pastix/src/build/spm/src/libspm.so.1(+0x409c8) [0x7f214731d9c8]
[arch-nspawn-3655178:3705086] [15] ./simple(+0xe2c) [0x555555556e2c]
[arch-nspawn-3655178:3705086] [16] /usr/lib/libc.so.6(+0x27fae) [0x7f2147181fae]
[arch-nspawn-3655178:3705086] [17] /usr/lib/libc.so.6(__libc_start_main+0x72) [0x7f21471820b8]
[arch-nspawn-3655178:3705086] [18] ./simple(+0x1174) [0x555555557174]
[arch-nspawn-3655178:3705086] *** End of error message ***
[arch-nspawn-3655178:3705070] *** Process received signal ***
[arch-nspawn-3655178:3705070] Signal: Segmentation fault (11)
[arch-nspawn-3655178:3705070] Signal code: Address not mapped (1)
[arch-nspawn-3655178:3705070] Failing at address: 0x7f21355a1860
[arch-nspawn-3655178:3705070] [ 0] linux-vdso.so.1(__vdso_rt_sigreturn+0x0) [0x7f432d9cc6cc]
[arch-nspawn-3655178:3705070] [ 1] /usr/lib/libopen-pal.so.80(mca_btl_sm_poll_handle_frag+0x18a) [0x7f431f7a3a02]
[arch-nspawn-3655178:3705070] [ 2] /usr/lib/libopen-pal.so.80(+0x74504) [0x7f431f7a4504]
[arch-nspawn-3655178:3705070] [ 3] /usr/lib/libopen-pal.so.80(opal_progress+0x30) [0x7f431f755a7a]
[arch-nspawn-3655178:3705070] [ 4] /usr/lib/libopen-pal.so.80(ompi_sync_wait_mt+0xda) [0x7f431f782aa2]
[arch-nspawn-3655178:3705070] [ 5] /usr/lib/libmpi.so.40(+0x7de1a) [0x7f431fc7de1a]
[arch-nspawn-3655178:3705070] [ 6] /usr/lib/libmpi.so.40(ompi_request_default_wait+0x1a) [0x7f431fc8019c]
[arch-nspawn-3655178:3705070] [ 7] /usr/lib/libmpi.so.40(ompi_coll_base_sendrecv_actual+0x98) [0x7f431fcf03e8]
[arch-nspawn-3655178:3705070] [ 8] /usr/lib/libmpi.so.40(ompi_coll_base_allreduce_intra_recursivedoubling+0x210) [0x7f431fcf1a88]
[arch-nspawn-3655178:3705070] [ 9] /usr/lib/libmpi.so.40(ompi_coll_base_allreduce_intra_ring+0x3fc) [0x7f431fcf443c]
[arch-nspawn-3655178:3705070] [10]
/usr/lib/libmpi.so.40(ompi_coll_tuned_allreduce_intra_dec_fixed+0x40) [0x7f431fd15152] [arch-nspawn-3655178:3705070] [11] /usr/lib/libmpi.so.40(MPI_Allreduce+0x294) [0x7f431fc8e584] [arch-nspawn-3655178:3705070] [12] /build/pastix/src/build/spm/src/libspm.so.1(spmUpdateComputedFields+0x140) [0x7f432d90b458] [arch-nspawn-3655178:3705070] [13] /build/pastix/src/build/spm/src/libspm.so.1(genLaplacian+0xaa) [0x7f432d91421e] [arch-nspawn-3655178:3705070] [14] /build/pastix/src/build/spm/src/libspm.so.1(+0x409c8) [0x7f432d9159c8] [arch-nspawn-3655178:3705070] [15] ./simple(+0xe2c) [0x555555556e2c] [arch-nspawn-3655178:3705070] [16] /usr/lib/libc.so.6(+0x27fae) [0x7f432caa3fae] [arch-nspawn-3655178:3705070] [17] /usr/lib/libc.so.6(__libc_start_main+0x72) [0x7f432caa40b8] [arch-nspawn-3655178:3705070] [18] ./simple(+0x1174) [0x555555557174] [arch-nspawn-3655178:3705070] *** End of error message *** -------------------------------------------------------------------------- prte noticed that process rank 2 with PID 3705070 on node arch-nspawn-3655178 exited on signal 11 (Segmentation fault). -------------------------------------------------------------------------- Start 3338: mpi_dst_example_simple_lap_c_facto3_sched4_kway_pqrcpend Test #2730: mpi_dst_example_simple_lap_c_facto0_sched1_kway_pqrcpend ................ Passed 145.73 sec Start 3339: mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_pqrcpbegin Test #2726: mpi_dst_example_simple_lap_c_facto0_sched1_kwayprojections_svdend ....... Passed 145.96 sec Start 3340: mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_pqrcpend Test #3158: mpi_dst_example_simple_lap_d_facto0_sched4_kway_tqrcpend ................ Passed 105.62 sec Start 3341: mpi_dst_example_simple_lap_c_facto3_sched4_not_rqrcpbegin Test #3106: mpi_dst_example_simple_lap_s_facto2_sched4_not_svdend ................... 
Passed 114.84 sec Start 3342: mpi_dst_example_simple_lap_c_facto3_sched4_not_rqrcpend Test #2808: mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_tqrcpend ..... Passed 144.92 sec Start 3343: mpi_dst_example_simple_lap_c_facto3_sched4_kway_rqrcpbegin Test #2776: mpi_dst_example_simple_lap_c_facto1_sched1_kwayprojections_tqrcpend ..... Passed 145.97 sec Start 3344: mpi_dst_example_simple_lap_c_facto3_sched4_kway_rqrcpend Test #2791: mpi_dst_example_simple_lap_c_facto2_sched1_not_pqrcpbegin ............... Passed 145.93 sec Start 3345: mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_rqrcpbegin Test #2763: mpi_dst_example_simple_lap_c_facto1_sched1_kwayprojections_pqrcpbegin ... Passed 146.71 sec Start 3346: mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_rqrcpend Test #2761: mpi_dst_example_simple_lap_c_facto1_sched1_kway_pqrcpbegin .............. Passed 147.14 sec Start 3347: mpi_dst_example_simple_lap_c_facto3_sched4_not_tqrcpbegin Test #2871: mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_tqrcpbegin ... Passed 143.88 sec Start 3348: mpi_dst_example_simple_lap_c_facto3_sched4_not_tqrcpend Test #2902: mpi_dst_example_simple_lap_z_facto0_sched1_kway_tqrcpend ................ Passed 142.53 sec Start 3349: mpi_dst_example_simple_lap_c_facto3_sched4_kway_tqrcpbegin Test #2784: mpi_dst_example_simple_lap_c_facto1_sched1_kway_pqrcpilu1 ............... Passed 147.89 sec Start 3350: mpi_dst_example_simple_lap_c_facto3_sched4_kway_tqrcpend Test #3114: mpi_dst_example_simple_lap_s_facto2_sched4_kway_pqrcpend ................ Passed 115.80 sec Start 3351: mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_tqrcpbegin Test #2785: mpi_dst_example_simple_lap_c_facto2_sched1_not_svdbegin ................. Passed 148.65 sec Start 3352: mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_tqrcpend Test #2756: mpi_dst_example_simple_lap_c_facto1_sched1_kway_svdend .................. 
Passed 149.07 sec Start 3353: mpi_dst_example_simple_lap_c_facto3_sched4_not_rqrrtbegin Test #2866: mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_rqrcpend ..... Passed 145.98 sec Start 3354: mpi_dst_example_simple_lap_c_facto3_sched4_not_rqrrtend Test #2748: mpi_dst_example_simple_lap_c_facto0_sched1_kway_rqrrtend ................ Passed 149.59 sec Start 3355: mpi_dst_example_simple_lap_c_facto3_sched4_kway_rqrrtbegin Test #2830: mpi_dst_example_simple_lap_c_facto3_sched1_not_rqrcpend ................. Passed 148.48 sec Start 3356: mpi_dst_example_simple_lap_c_facto3_sched4_kway_rqrrtend Test #2806: mpi_dst_example_simple_lap_c_facto2_sched1_kway_tqrcpend ................ Passed 149.14 sec Start 3357: mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_rqrrtbegin Test #2781: mpi_dst_example_simple_lap_c_facto1_sched1_kwayprojections_rqrrtbegin ... Passed 151.58 sec Start 3358: mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_rqrrtend Test #2758: mpi_dst_example_simple_lap_c_facto1_sched1_kwayprojections_svdend ....... Passed 152.09 sec Start 3359: mpi_dst_example_simple_lap_c_facto3_sched4_kway_pqrcpilu0 Test #2744: mpi_dst_example_simple_lap_c_facto0_sched1_kwayprojections_tqrcpend ..... Passed 152.61 sec Start 3360: mpi_dst_example_simple_lap_c_facto3_sched4_kway_pqrcpilu1 Test #3152: mpi_dst_example_simple_lap_d_facto0_sched4_kway_rqrcpend ................ Passed 113.55 sec Start 3361: mpi_dst_example_simple_lap_c_facto4_sched4_not_svdbegin Test #3086: mpi_dst_example_simple_lap_s_facto1_sched4_not_rqrcpend ................. Passed 125.08 sec Start 3362: mpi_dst_example_simple_lap_c_facto4_sched4_not_svdend Test #2769: mpi_dst_example_simple_lap_c_facto1_sched1_kwayprojections_rqrcpbegin ... Passed 155.84 sec Start 3363: mpi_dst_example_simple_lap_c_facto4_sched4_kway_svdbegin Test #2735: mpi_dst_example_simple_lap_c_facto0_sched1_kway_rqrcpbegin .............. 
Passed 158.82 sec Start 3364: mpi_dst_example_simple_lap_c_facto4_sched4_kway_svdend Test #2850: mpi_dst_example_simple_lap_c_facto4_sched1_not_svdend ................... Passed 156.95 sec Start 3365: mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_svdbegin Test #2815: mpi_dst_example_simple_lap_c_facto2_sched1_kway_pqrcpilu0 ............... Passed 158.08 sec Start 3366: mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_svdend Test #2887: mpi_dst_example_simple_lap_z_facto0_sched1_not_pqrcpbegin ............... Passed 154.60 sec Start 3367: mpi_dst_example_simple_lap_c_facto4_sched4_not_pqrcpbegin Test #2879: mpi_dst_example_simple_lap_c_facto4_sched1_kway_pqrcpilu0 ............... Passed 155.37 sec Start 3368: mpi_dst_example_simple_lap_c_facto4_sched4_not_pqrcpend Test #2884: mpi_dst_example_simple_lap_z_facto0_sched1_kway_svdend .................. Passed 155.04 sec Test #3089: mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_rqrcpbegin ... Passed 129.82 sec Start 3369: mpi_dst_example_simple_lap_c_facto4_sched4_kway_pqrcpbegin Start 3370: mpi_dst_example_simple_lap_c_facto4_sched4_kway_pqrcpend Test #2755: mpi_dst_example_simple_lap_c_facto1_sched1_kway_svdbegin ................ Passed 160.02 sec Start 3371: mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_pqrcpbegin Test #2831: mpi_dst_example_simple_lap_c_facto3_sched1_kway_rqrcpbegin .............. Passed 158.71 sec Test #2741: mpi_dst_example_simple_lap_c_facto0_sched1_kway_tqrcpbegin .............. Passed 160.52 sec Start 3372: mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_pqrcpend Start 3373: mpi_dst_example_simple_lap_c_facto4_sched4_not_rqrcpbegin Test #2823: mpi_dst_example_simple_lap_c_facto3_sched1_not_pqrcpbegin ............... Passed 159.15 sec Start 3374: mpi_dst_example_simple_lap_c_facto4_sched4_not_rqrcpend Test #2737: mpi_dst_example_simple_lap_c_facto0_sched1_kwayprojections_rqrcpbegin ... 
Passed 160.75 sec Start 3375: mpi_dst_example_simple_lap_c_facto4_sched4_kway_rqrcpbegin Test #2893: mpi_dst_example_simple_lap_z_facto0_sched1_not_rqrcpbegin ............... Passed 156.65 sec Start 3376: mpi_dst_example_simple_lap_c_facto4_sched4_kway_rqrcpend Test #2773: mpi_dst_example_simple_lap_c_facto1_sched1_kway_tqrcpbegin .............. Passed 161.70 sec Start 3377: mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_rqrcpbegin Test #2818: mpi_dst_example_simple_lap_c_facto3_sched1_not_svdend ................... Passed 161.27 sec Start 3378: mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_rqrcpend Test #2780: mpi_dst_example_simple_lap_c_facto1_sched1_kway_rqrrtend ................ Passed 162.62 sec Start 3379: mpi_dst_example_simple_lap_c_facto4_sched4_not_tqrcpbegin Test #2869: mpi_dst_example_simple_lap_c_facto4_sched1_kway_tqrcpbegin .............. Passed 159.66 sec Start 3380: mpi_dst_example_simple_lap_c_facto4_sched4_not_tqrcpend Test #2749: mpi_dst_example_simple_lap_c_facto0_sched1_kwayprojections_rqrrtbegin ... Passed 163.48 sec Start 3381: mpi_dst_example_simple_lap_c_facto4_sched4_kway_tqrcpbegin Test #2875: mpi_dst_example_simple_lap_c_facto4_sched1_kway_rqrrtbegin .............. Passed 159.56 sec Start 3382: mpi_dst_example_simple_lap_c_facto4_sched4_kway_tqrcpend Test #2827: mpi_dst_example_simple_lap_c_facto3_sched1_kwayprojections_pqrcpbegin ... Passed 162.78 sec Start 3383: mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_tqrcpbegin Test #2886: mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_svdend ....... Passed 159.55 sec Start 3384: mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_tqrcpend Test #2849: mpi_dst_example_simple_lap_c_facto4_sched1_not_svdbegin ................. Passed 162.15 sec Start 3385: mpi_dst_example_simple_lap_c_facto4_sched4_not_rqrrtbegin Test #3084: mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_pqrcpend ..... 
Passed 136.33 sec Start 3386: mpi_dst_example_simple_lap_c_facto4_sched4_not_rqrrtend Test #2739: mpi_dst_example_simple_lap_c_facto0_sched1_not_tqrcpbegin ............... Passed 165.38 sec Start 3387: mpi_dst_example_simple_lap_c_facto4_sched4_kway_rqrrtbegin Test #2833: mpi_dst_example_simple_lap_c_facto3_sched1_kwayprojections_rqrcpbegin ... Passed 164.38 sec Start 3388: mpi_dst_example_simple_lap_c_facto4_sched4_kway_rqrrtend Test #2901: mpi_dst_example_simple_lap_z_facto0_sched1_kway_tqrcpbegin .............. Passed 160.81 sec Start 3389: mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_rqrrtbegin Test #2727: mpi_dst_example_simple_lap_c_facto0_sched1_not_pqrcpbegin ............... Passed 166.77 sec Start 3390: mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_rqrrtend Test #2854: mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_svdend ....... Passed 165.77 sec Start 3391: mpi_dst_example_simple_lap_c_facto4_sched4_kway_pqrcpilu0 Test #2743: mpi_dst_example_simple_lap_c_facto0_sched1_kwayprojections_tqrcpbegin ... Passed 169.25 sec Start 3392: mpi_dst_example_simple_lap_c_facto4_sched4_kway_pqrcpilu1 Test #2872: mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_tqrcpend ..... Passed 165.39 sec Start 3393: mpi_dst_example_simple_lap_z_facto0_sched4_not_svdbegin Test #2733: mpi_dst_example_simple_lap_c_facto0_sched1_not_rqrcpbegin ............... Passed 169.39 sec Start 3394: mpi_dst_example_simple_lap_z_facto0_sched4_not_svdend Test #2903: mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_tqrcpbegin ... Passed 164.32 sec Start 3395: mpi_dst_example_simple_lap_z_facto0_sched4_kway_svdbegin Test #2885: mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_svdbegin ..... Passed 165.83 sec Start 3396: mpi_dst_example_simple_lap_z_facto0_sched4_kway_svdend Test #2725: mpi_dst_example_simple_lap_c_facto0_sched1_kwayprojections_svdbegin ..... 
Passed 171.09 sec Start 3397: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_svdbegin Test #2817: mpi_dst_example_simple_lap_c_facto3_sched1_not_svdbegin ................. Passed 170.18 sec Start 3398: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_svdend Test #2889: mpi_dst_example_simple_lap_z_facto0_sched1_kway_pqrcpbegin .............. Passed 166.94 sec Start 3399: mpi_dst_example_simple_lap_z_facto0_sched4_not_pqrcpbegin Test #2867: mpi_dst_example_simple_lap_c_facto4_sched1_not_tqrcpbegin ............... Passed 168.86 sec Start 3400: mpi_dst_example_simple_lap_z_facto0_sched4_not_pqrcpend Test #2892: mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_pqrcpend ..... Passed 167.64 sec Start 3401: mpi_dst_example_simple_lap_z_facto0_sched4_kway_pqrcpbegin Test #2842: mpi_dst_example_simple_lap_c_facto3_sched1_not_rqrrtend ................. Passed 171.69 sec Start 3402: mpi_dst_example_simple_lap_z_facto0_sched4_kway_pqrcpend Test #2839: mpi_dst_example_simple_lap_c_facto3_sched1_kwayprojections_tqrcpbegin ... Passed 173.42 sec Start 3403: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_pqrcpbegin Test #2895: mpi_dst_example_simple_lap_z_facto0_sched1_kway_rqrcpbegin .............. Passed 170.16 sec Start 3404: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_pqrcpend Test #2821: mpi_dst_example_simple_lap_c_facto3_sched1_kwayprojections_svdbegin ..... Passed 174.65 sec Start 3405: mpi_dst_example_simple_lap_z_facto0_sched4_not_rqrcpbegin Test #2911: mpi_dst_example_simple_lap_z_facto0_sched1_kway_pqrcpilu0 ............... Passed 153.90 sec Start 3406: mpi_dst_example_simple_lap_z_facto0_sched4_not_rqrcpend Test #3066: mpi_dst_example_simple_lap_s_facto0_sched4_not_rqrrtend ................. Passed 152.52 sec Start 3407: mpi_dst_example_simple_lap_z_facto0_sched4_kway_rqrcpbegin Test #2819: mpi_dst_example_simple_lap_c_facto3_sched1_kway_svdbegin ................ 
Passed 177.42 sec Start 3408: mpi_dst_example_simple_lap_z_facto0_sched4_kway_rqrcpend Test #2841: mpi_dst_example_simple_lap_c_facto3_sched1_not_rqrrtbegin ............... Passed 178.70 sec Start 3409: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_rqrcpbegin Test #2883: mpi_dst_example_simple_lap_z_facto0_sched1_kway_svdbegin ................ Passed 177.45 sec Start 3410: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_rqrcpend Test #2864: mpi_dst_example_simple_lap_c_facto4_sched1_kway_rqrcpend ................ Passed 186.91 sec Start 3411: mpi_dst_example_simple_lap_z_facto0_sched4_not_tqrcpbegin Test #2880: mpi_dst_example_simple_lap_c_facto4_sched1_kway_pqrcpilu1 ............... Passed 191.79 sec Start 3412: mpi_dst_example_simple_lap_z_facto0_sched4_not_tqrcpend Test #2881: mpi_dst_example_simple_lap_z_facto0_sched1_not_svdbegin ................. Passed 194.10 sec Start 3413: mpi_dst_example_simple_lap_z_facto0_sched4_kway_tqrcpbegin Test #2807: mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_tqrcpbegin ...***Timeout 228.27 sec Test #2810: mpi_dst_example_simple_lap_c_facto2_sched1_not_rqrrtend .................***Timeout 228.22 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2811: mpi_dst_example_simple_lap_c_facto2_sched1_kway_rqrrtbegin ..............***Timeout 228.13 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD 
Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Test #2812: mpi_dst_example_simple_lap_c_facto2_sched1_kway_rqrrtend ................***Timeout 228.07 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2814: mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_rqrrtend .....***Timeout 228.06 sec Test #2834: mpi_dst_example_simple_lap_c_facto3_sched1_kwayprojections_rqrcpend .....***Timeout 227.65 sec ischedInit: The thread number has been automatically set to 256 
Test #2835: mpi_dst_example_simple_lap_c_facto3_sched1_not_tqrcpbegin ...............***Timeout 227.57 sec ischedInit: The thread number has been automatically set to 256 Test #2836: mpi_dst_example_simple_lap_c_facto3_sched1_not_tqrcpend .................***Timeout 227.55 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2837: mpi_dst_example_simple_lap_c_facto3_sched1_kway_tqrcpbegin ..............***Timeout 227.53 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2843: mpi_dst_example_simple_lap_c_facto3_sched1_kway_rqrrtbegin ..............***Timeout 227.33 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: 
Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2846: mpi_dst_example_simple_lap_c_facto3_sched1_kwayprojections_rqrrtend .....***Timeout 227.26 sec Test #2847: mpi_dst_example_simple_lap_c_facto3_sched1_kway_pqrcpilu0 ...............***Timeout 227.20 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization 
method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Test #2848: mpi_dst_example_simple_lap_c_facto3_sched1_kway_pqrcpilu1 ...............***Timeout 227.11 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2851: mpi_dst_example_simple_lap_c_facto4_sched1_kway_svdbegin ................***Timeout 226.83 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: 
PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2853: mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_svdbegin .....***Timeout 226.69 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2858: mpi_dst_example_simple_lap_c_facto4_sched1_kway_pqrcpend 
................***Timeout 226.44 sec Test #2860: mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_pqrcpend .....***Timeout 226.39 sec Test #2862: mpi_dst_example_simple_lap_c_facto4_sched1_not_rqrcpend .................***Timeout 226.26 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2863: mpi_dst_example_simple_lap_c_facto4_sched1_kway_rqrcpbegin ..............***Timeout 226.24 sec Test #2865: mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_rqrcpbegin ...***Timeout 226.19 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2870: mpi_dst_example_simple_lap_c_facto4_sched1_kway_tqrcpend ................***Timeout 225.71 sec ischedInit: The thread number has been automatically set to 256 Test #2877: mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_rqrrtbegin ...***Timeout 225.33 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : 
Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2882: mpi_dst_example_simple_lap_z_facto0_sched1_not_svdend ...................***Timeout 224.88 sec ischedInit: The thread number has been automatically set to 256 Test #2891: mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_pqrcpbegin ...***Timeout 224.14 sec Test #2894: mpi_dst_example_simple_lap_z_facto0_sched1_not_rqrcpend .................***Timeout 224.08 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 
Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2898: mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_rqrcpend .....***Timeout 223.98 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2899: mpi_dst_example_simple_lap_z_facto0_sched1_not_tqrcpbegin ...............***Timeout 223.93 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 
256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.294804e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.226663e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.150067e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 8.931292e-01 s 
+-------------------------------------------------+ Analyze task: Total time for analyze 6.792727e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.017124e-01 s Time to initialize coeftab 3.286213e-01 s Time to factorize 7.091035e+00 s ( 2.86 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Test #3074: mpi_dst_example_simple_lap_s_facto1_sched4_not_svdend ................... Passed 203.73 sec Test #3087: mpi_dst_example_simple_lap_s_facto1_sched4_kway_rqrcpbegin ..............***Timeout 207.53 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number 
has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.304382e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.960019e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.819665e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 8.145449e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.833535e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.459796e-01 s Time to initialize coeftab 2.409277e-01 s Time to factorize 2.197276e+00 s ( 2.38 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.2 Ko / 44.3 Ko ------------------------------------------------ Total 68.4 Ko / 68.5 Ko Time to solve 1.033944e+00 s - iteration 1 : total iteration time 1.2 s error 3.3031e-11 Time for refinement 2.902735e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 
2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.899420e-08 max(|| b_i - A x_i ||_1) 2.965326e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.726200e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.899420e-08 max(|| b_i - A x_i ||_1) 2.965326e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.726200e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.899420e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.899420e-08 max(|| b_i - A x_i ||_1) 2.965326e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.726200e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.965326e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.726200e-01 (SUCCESS) Start 3087: mpi_dst_example_simple_lap_s_facto1_sched4_kway_rqrcpbegin Test #3093: mpi_dst_example_simple_lap_s_facto1_sched4_kway_tqrcpbegin ..............***Timeout 209.03 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance 
criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.935721e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.048836e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.683186e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 9.463811e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.286480e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 5.350940e-01 s Time to initialize coeftab 1.627818e-01 s Time to factorize 2.421148e+00 s ( 2.16 MFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.2 Ko / 44.3 Ko ------------------------------------------------ Total 68.4 Ko / 68.5 Ko Time to solve 6.152068e-01 s - iteration 1 : total iteration time 0.56 s error 3.5071e-11 Time 
for refinement 1.457132e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.557614e-08 max(|| b_i - A x_i ||_1) 2.791574e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.507864e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.557614e-08 max(|| b_i - A x_i ||_1) 2.791574e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.507864e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.557614e-08 max(|| b_i - A x_i ||_1) 2.791574e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.507864e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.557614e-08 max(|| b_i - A x_i ||_1) 2.791574e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.507864e-01 (SUCCESS) Start 3093: mpi_dst_example_simple_lap_s_facto1_sched4_kway_tqrcpbegin Test #3131: mpi_dst_example_simple_lap_s_facto2_sched4_kway_rqrrtbegin ..............***Timeout 212.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - 
Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.547610e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.876481e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.356643e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.380941e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.200437e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.740719e-02 s Time to initialize coeftab 1.016513e-01 s Time to factorize 1.371354e+00 s ( 7.28 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko 
------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 1.132064e+00 s - iteration 1 : total iteration time 1.94 s error 7.031e-11 Time for refinement 2.590929e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.076347e-08 max(|| b_i - A x_i ||_1) 2.967027e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.728337e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.076347e-08 max(|| b_i - A x_i ||_1) 2.967027e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.728337e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.076347e-08 max(|| b_i - A x_i ||_1) 2.967027e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.728337e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.076347e-08 max(|| b_i - A x_i ||_1) 2.967027e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.728337e-01 (SUCCESS) Start 3131: mpi_dst_example_simple_lap_s_facto2_sched4_kway_rqrrtbegin 2930/3626 Test #3210: mpi_dst_example_simple_lap_d_facto2_sched4_kway_pqrcpend ................***Timeout 165.69 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: 
Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.282826e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.719108e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.673734e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 3.089141e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.098728e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.381601e-01 s Time to initialize coeftab 3.524546e-02 s Time to factorize 1.183085e+00 s ( 8.44 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank 
supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 8.210759e-01 s - iteration 1 : total iteration time 1.1 s error 1.5286e-15 Time for refinement 1.980480e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.528125e-15 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.528125e-15 max(|| b_i - A x_i ||_1) 2.229769e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.801896e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 2.229769e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.801896e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.528125e-15 max(|| b_i - A x_i ||_1) 2.229769e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.801896e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.528125e-15 max(|| b_i - A x_i ||_1) 2.229769e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.801896e-03 (SUCCESS) Start 3210: mpi_dst_example_simple_lap_d_facto2_sched4_kway_pqrcpend 2930/3626 Test #3217: mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_rqrcpbegin ...***Timeout 169.39 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: 
Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.605444e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.210746e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.404680e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 3.018580e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.151341e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.177074e-02 s Time to initialize coeftab 1.788453e-01 s Time to 
factorize 2.830641e+00 s ( 3.53 MFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 225 Ko / 226 Ko Time to solve 6.511883e-01 s - iteration 1 : total iteration time 0.928 s error 3.3576e-14 Time for refinement 1.955470e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.357670e-14 max(|| b_i - A x_i ||_1) 6.093460e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.656956e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.357670e-14 max(|| b_i - A x_i ||_1) 6.093460e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.656956e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.357670e-14 max(|| b_i - A x_i ||_1) 6.093460e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.656956e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.357670e-14 max(|| b_i - A x_i ||_1) 6.093460e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.656956e-02 (SUCCESS) Start 3217: mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_rqrcpbegin 2930/3626 Test #3252: mpi_dst_example_simple_lap_c_facto0_sched4_not_tqrcpend .................***Timeout 394.71 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.460845e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.248431e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.670975e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 4.478388e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 
5.085004e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.202393e-01 s Time to initialize coeftab 4.435256e-02 s Time to factorize 4.114094e+00 s ( 4.93 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 6.234848e-01 s Time for refinement 3.500841e-01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.020421e-07 max(|| b_i - A x_i ||_1) 1.474137e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.719761e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.020421e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.020421e-07 max(|| b_i - A x_i ||_1) 1.474137e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.719761e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.020421e-07 max(|| b_i - A x_i ||_1) 1.474137e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.719761e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.474137e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.719761e+00 (SUCCESS) Start 3252: mpi_dst_example_simple_lap_c_facto0_sched4_not_tqrcpend Start 3414: mpi_dst_example_simple_lap_z_facto0_sched4_kway_tqrcpend Start 3415: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_tqrcpbegin Start 3416: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_tqrcpend Start 
3417: mpi_dst_example_simple_lap_z_facto0_sched4_not_rqrrtbegin
Start 3418: mpi_dst_example_simple_lap_z_facto0_sched4_not_rqrrtend
Start 3419: mpi_dst_example_simple_lap_z_facto0_sched4_kway_rqrrtbegin
Start 3420: mpi_dst_example_simple_lap_z_facto0_sched4_kway_rqrrtend
Start 3421: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_rqrrtbegin
Start 3422: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_rqrrtend
Start 3423: mpi_dst_example_simple_lap_z_facto0_sched4_kway_pqrcpilu0
Start 3424: mpi_dst_example_simple_lap_z_facto0_sched4_kway_pqrcpilu1
Start 3425: mpi_dst_example_simple_lap_z_facto1_sched4_not_svdbegin
Start 3426: mpi_dst_example_simple_lap_z_facto1_sched4_not_svdend
Start 3427: mpi_dst_example_simple_lap_z_facto1_sched4_kway_svdbegin
Start 3428: mpi_dst_example_simple_lap_z_facto1_sched4_kway_svdend
Start 3429: mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_svdbegin
Start 3430: mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_svdend
Start 3431: mpi_dst_example_simple_lap_z_facto1_sched4_not_pqrcpbegin
Start 3432: mpi_dst_example_simple_lap_z_facto1_sched4_not_pqrcpend
Start 3433: mpi_dst_example_simple_lap_z_facto1_sched4_kway_pqrcpbegin
Start 3434: mpi_dst_example_simple_lap_z_facto1_sched4_kway_pqrcpend
Start 3435: mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_pqrcpbegin
Start 3436: mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_pqrcpend
Start 3437: mpi_dst_example_simple_lap_z_facto1_sched4_not_rqrcpbegin
Start 3438: mpi_dst_example_simple_lap_z_facto1_sched4_not_rqrcpend
Start 3439: mpi_dst_example_simple_lap_z_facto1_sched4_kway_rqrcpbegin
Start 3440: mpi_dst_example_simple_lap_z_facto1_sched4_kway_rqrcpend
Start 3441: mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_rqrcpbegin
Test #2984: mpi_dst_example_simple_lap_z_facto3_sched1_not_pqrcpend .................
Passed 396.94 sec Test #2906: mpi_dst_example_simple_lap_z_facto0_sched1_not_rqrrtend .................***Timeout 494.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2908: mpi_dst_example_simple_lap_z_facto0_sched1_kway_rqrrtend ................***Timeout 492.64 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Test #2909: mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_rqrrtbegin ...***Timeout 492.62 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started 
thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.293451e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.364361e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.798782e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.771602e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.069108e+00 s 
+-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.012387e-01 s Time to initialize coeftab 1.101240e-01 s Time to factorize 3.942420e+00 s ( 5.14 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Test #3065: mpi_dst_example_simple_lap_s_facto0_sched4_not_rqrrtbegin ...............***Timeout 488.58 sec Start 3065: mpi_dst_example_simple_lap_s_facto0_sched4_not_rqrrtbegin Test #3067: mpi_dst_example_simple_lap_s_facto0_sched4_kway_rqrrtbegin ..............***Timeout 488.53 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 3067: 
mpi_dst_example_simple_lap_s_facto0_sched4_kway_rqrrtbegin Test #3068: mpi_dst_example_simple_lap_s_facto0_sched4_kway_rqrrtend ................***Timeout 488.76 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 3068: mpi_dst_example_simple_lap_s_facto0_sched4_kway_rqrrtend Test #3069: mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_rqrrtbegin ...***Timeout 488.76 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 3069: mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_rqrrtbegin Test #3070: mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_rqrrtend .....***Timeout 488.75 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 3070: 
mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_rqrrtend Test #3071: mpi_dst_example_simple_lap_s_facto0_sched4_kway_pqrcpilu0 ...............***Timeout 488.74 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 3071: mpi_dst_example_simple_lap_s_facto0_sched4_kway_pqrcpilu0 Test #3072: mpi_dst_example_simple_lap_s_facto0_sched4_kway_pqrcpilu1 ...............***Timeout 488.97 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 3072: mpi_dst_example_simple_lap_s_facto0_sched4_kway_pqrcpilu1 Test #3073: mpi_dst_example_simple_lap_s_facto1_sched4_not_svdbegin .................***Timeout 489.23 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 3073: mpi_dst_example_simple_lap_s_facto1_sched4_not_svdbegin Test #3081: mpi_dst_example_simple_lap_s_facto1_sched4_kway_pqrcpbegin ..............***Timeout 486.02 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 3081: mpi_dst_example_simple_lap_s_facto1_sched4_kway_pqrcpbegin Test #3082: mpi_dst_example_simple_lap_s_facto1_sched4_kway_pqrcpend ................***Timeout 486.01 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 3082: mpi_dst_example_simple_lap_s_facto1_sched4_kway_pqrcpend Test #3083: mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_pqrcpbegin ...***Timeout 486.00 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ 
Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 3083: mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_pqrcpbegin Test #3088: mpi_dst_example_simple_lap_s_facto1_sched4_kway_rqrcpend ................***Timeout 485.05 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 3088: mpi_dst_example_simple_lap_s_facto1_sched4_kway_rqrcpend Test #3090: mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_rqrcpend .....***Timeout 485.28 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size 
(min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 3090: mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_rqrcpend Test #3092: mpi_dst_example_simple_lap_s_facto1_sched4_not_tqrcpend .................***Timeout 485.16 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 3092: mpi_dst_example_simple_lap_s_facto1_sched4_not_tqrcpend Test #3094: mpi_dst_example_simple_lap_s_facto1_sched4_kway_tqrcpend ................***Timeout 485.14 sec ischedInit: The 
thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 3094: mpi_dst_example_simple_lap_s_facto1_sched4_kway_tqrcpend Test #3096: mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_tqrcpend .....***Timeout 484.91 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 3096: mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_tqrcpend Test #3100: mpi_dst_example_simple_lap_s_facto1_sched4_kway_rqrrtend ................***Timeout 484.84 sec ischedInit: The thread number has been automatically set to 256 Start 3100: mpi_dst_example_simple_lap_s_facto1_sched4_kway_rqrrtend Test #3102: mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_rqrrtend .....***Timeout 484.63 sec ischedInit: The thread number has been automatically set to 256 Start 3102: 
mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_rqrrtend Test #3103: mpi_dst_example_simple_lap_s_facto1_sched4_kway_pqrcpilu0 ...............***Timeout 484.63 sec ischedInit: The thread number has been automatically set to 256 Start 3103: mpi_dst_example_simple_lap_s_facto1_sched4_kway_pqrcpilu0 Test #3104: mpi_dst_example_simple_lap_s_facto1_sched4_kway_pqrcpilu1 ...............***Timeout 484.62 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 3104: mpi_dst_example_simple_lap_s_facto1_sched4_kway_pqrcpilu1 Test #3105: mpi_dst_example_simple_lap_s_facto2_sched4_not_svdbegin .................***Timeout 484.96 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 3105: mpi_dst_example_simple_lap_s_facto2_sched4_not_svdbegin Test #3107: mpi_dst_example_simple_lap_s_facto2_sched4_kway_svdbegin 
................***Timeout 484.93 sec ischedInit: The thread number has been automatically set to 256 Start 3107: mpi_dst_example_simple_lap_s_facto2_sched4_kway_svdbegin Test #3108: mpi_dst_example_simple_lap_s_facto2_sched4_kway_svdend ..................***Timeout 484.93 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 3108: mpi_dst_example_simple_lap_s_facto2_sched4_kway_svdend Test #3112: mpi_dst_example_simple_lap_s_facto2_sched4_not_pqrcpend .................***Timeout 483.66 sec ischedInit: The thread number has been automatically set to 256 Start 3112: mpi_dst_example_simple_lap_s_facto2_sched4_not_pqrcpend Test #3115: mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_pqrcpbegin ...***Timeout 483.03 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 3115: 
mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_pqrcpbegin Test #3116: mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_pqrcpend .....***Timeout 483.03 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 3116: mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_pqrcpend Test #3118: mpi_dst_example_simple_lap_s_facto2_sched4_not_rqrcpend .................***Timeout 482.77 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 3118: mpi_dst_example_simple_lap_s_facto2_sched4_not_rqrcpend Test #3122: mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_rqrcpend .....***Timeout 482.00 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 3122: mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_rqrcpend Test #3124: mpi_dst_example_simple_lap_s_facto2_sched4_not_tqrcpend .................***Timeout 481.73 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance 
criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3124: mpi_dst_example_simple_lap_s_facto2_sched4_not_tqrcpend Test #3125: mpi_dst_example_simple_lap_s_facto2_sched4_kway_tqrcpbegin ..............***Timeout 481.72 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 3125: mpi_dst_example_simple_lap_s_facto2_sched4_kway_tqrcpbegin Test #3128: mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_tqrcpend .....***Timeout 480.22 sec ischedInit: The thread number 
has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 3128: mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_tqrcpend Test #3132: mpi_dst_example_simple_lap_s_facto2_sched4_kway_rqrrtend ................***Timeout 479.20 sec Start 3132: mpi_dst_example_simple_lap_s_facto2_sched4_kway_rqrrtend Test #3133: mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_rqrrtbegin ...***Timeout 479.20 sec ischedInit: The thread number has been automatically set to 256 Start 3133: mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_rqrrtbegin Test #3134: mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_rqrrtend .....***Timeout 479.53 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX 
package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 3134: mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_rqrrtend Test #3135: mpi_dst_example_simple_lap_s_facto2_sched4_kway_pqrcpilu0 ...............***Timeout 479.52 sec Start 3135: mpi_dst_example_simple_lap_s_facto2_sched4_kway_pqrcpilu0 Test #3138: mpi_dst_example_simple_lap_d_facto0_sched4_not_svdend ...................***Timeout 479.23 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: 
Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.305627e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.148064e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.076568e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 5.199037e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.194457e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.483433e-01 s Time to initialize coeftab 6.215342e-02 s Time to factorize 2.015158e+00 s ( 2.51 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko 
------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 8.514218e-01 s Start 3138: mpi_dst_example_simple_lap_d_facto0_sched4_not_svdend Test #3143: mpi_dst_example_simple_lap_d_facto0_sched4_not_pqrcpbegin ...............***Timeout 478.81 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 3143: mpi_dst_example_simple_lap_d_facto0_sched4_not_pqrcpbegin Test #3144: mpi_dst_example_simple_lap_d_facto0_sched4_not_pqrcpend .................***Timeout 479.68 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI 
processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.676214e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.145163e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.526027e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.428053e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.596392e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Start 3144: mpi_dst_example_simple_lap_d_facto0_sched4_not_pqrcpend Test #3147: 
mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_pqrcpbegin ...***Timeout 479.74 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.372214e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.209720e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.299789e-01 s +-------------------------------------------------+ 
Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.476604e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.219019e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Start 3147: mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_pqrcpbegin Test #3149: mpi_dst_example_simple_lap_d_facto0_sched4_not_rqrcpbegin ...............***Timeout 479.48 sec ischedInit: The thread number has been automatically set to 256 Start 3149: mpi_dst_example_simple_lap_d_facto0_sched4_not_rqrcpbegin Test #3150: mpi_dst_example_simple_lap_d_facto0_sched4_not_rqrcpend .................***Timeout 479.94 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 3150: mpi_dst_example_simple_lap_d_facto0_sched4_not_rqrcpend Test #2922: 
mpi_dst_example_simple_lap_z_facto1_sched1_kway_pqrcpend ................***Timeout 476.42 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #2938: mpi_dst_example_simple_lap_z_facto1_sched1_not_rqrrtend .................***Timeout 462.32 sec Test #2939: mpi_dst_example_simple_lap_z_facto1_sched1_kway_rqrrtbegin ..............***Timeout 462.29 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #2953: mpi_dst_example_simple_lap_z_facto2_sched1_kway_pqrcpbegin ..............***Timeout 456.78 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread 
static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Test #2969: mpi_dst_example_simple_lap_z_facto2_sched1_not_rqrrtbegin ...............***Timeout 451.11 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Test #2989: 
mpi_dst_example_simple_lap_z_facto3_sched1_not_rqrcpbegin ...............***Timeout 445.16 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.642168e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.993942e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.352372e-02 s +-------------------------------------------------+ 
Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.079188e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.448373e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 7.034695e-01 s Time to initialize coeftab 1.580515e-01 s Test #2997: mpi_dst_example_simple_lap_z_facto3_sched1_kway_tqrcpbegin ..............***Timeout 443.04 sec Test #3001: mpi_dst_example_simple_lap_z_facto3_sched1_not_rqrrtbegin ...............***Timeout 441.25 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #3003: mpi_dst_example_simple_lap_z_facto3_sched1_kway_rqrrtbegin ..............***Timeout 440.83 sec ischedInit: The 
thread number has been automatically set to 256 Test #3004: mpi_dst_example_simple_lap_z_facto3_sched1_kway_rqrrtend ................***Timeout 440.82 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Test #3005: mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_rqrrtbegin ...***Timeout 440.80 sec Test #3008: mpi_dst_example_simple_lap_z_facto3_sched1_kway_pqrcpilu1 ...............***Timeout 437.77 sec Test #3020: mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_pqrcpend .....***Timeout 433.73 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: 
Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #3049: mpi_dst_example_simple_lap_s_facto0_sched4_kway_pqrcpbegin ..............***Timeout 424.11 sec ischedInit: The thread number has been automatically set to 256 Test #3057: mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_rqrcpbegin ...***Timeout 420.49 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 
Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Test #3160: mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_tqrcpend .....***Timeout 414.99 sec ischedInit: The thread number has been automatically set to 256 Start 3160: mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_tqrcpend 2949/3626 Test #3189: mpi_dst_example_simple_lap_d_facto1_sched4_kway_tqrcpbegin ..............***Timeout 411.29 sec Start 3189: mpi_dst_example_simple_lap_d_facto1_sched4_kway_tqrcpbegin 2949/3626 Test #3190: mpi_dst_example_simple_lap_d_facto1_sched4_kway_tqrcpend ................***Timeout 412.10 sec Start 3190: mpi_dst_example_simple_lap_d_facto1_sched4_kway_tqrcpend 2949/3626 Test #3191: mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_tqrcpbegin ...***Timeout 412.56 sec Start 3191: mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_tqrcpbegin 2949/3626 Test #3192: mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_tqrcpend .....***Timeout 413.99 sec Start 3192: mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_tqrcpend 2949/3626 Test #3193: mpi_dst_example_simple_lap_d_facto1_sched4_not_rqrrtbegin ...............***Timeout 415.49 sec Start 3193: mpi_dst_example_simple_lap_d_facto1_sched4_not_rqrrtbegin 2949/3626 Test #3194: mpi_dst_example_simple_lap_d_facto1_sched4_not_rqrrtend .................***Timeout 416.56 sec Start 3194: mpi_dst_example_simple_lap_d_facto1_sched4_not_rqrrtend 2949/3626 Test #3195: mpi_dst_example_simple_lap_d_facto1_sched4_kway_rqrrtbegin ..............***Timeout 416.50 sec Start 3195: mpi_dst_example_simple_lap_d_facto1_sched4_kway_rqrrtbegin 2949/3626 Test #3196: 
mpi_dst_example_simple_lap_d_facto1_sched4_kway_rqrrtend ................***Timeout 416.46 sec Start 3196: mpi_dst_example_simple_lap_d_facto1_sched4_kway_rqrrtend 2949/3626 Test #3197: mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_rqrrtbegin ...***Timeout 417.48 sec Start 3197: mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_rqrrtbegin 2949/3626 Test #3198: mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_rqrrtend .....***Timeout 417.96 sec Start 3198: mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_rqrrtend 2949/3626 Test #3199: mpi_dst_example_simple_lap_d_facto1_sched4_kway_pqrcpilu0 ...............***Timeout 417.95 sec Start 3199: mpi_dst_example_simple_lap_d_facto1_sched4_kway_pqrcpilu0 2949/3626 Test #3200: mpi_dst_example_simple_lap_d_facto1_sched4_kway_pqrcpilu1 ...............***Timeout 417.93 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections 
distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.516600e+00 s Start 3200: mpi_dst_example_simple_lap_d_facto1_sched4_kway_pqrcpilu1 2949/3626 Test #3201: mpi_dst_example_simple_lap_d_facto2_sched4_not_svdbegin .................***Timeout 417.92 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 3201: mpi_dst_example_simple_lap_d_facto2_sched4_not_svdbegin 2949/3626 Test #3202: mpi_dst_example_simple_lap_d_facto2_sched4_not_svdend ...................***Timeout 417.91 sec Start 3202: mpi_dst_example_simple_lap_d_facto2_sched4_not_svdend 2949/3626 Test #3203: mpi_dst_example_simple_lap_d_facto2_sched4_kway_svdbegin ................***Timeout 417.87 sec Start 3203: mpi_dst_example_simple_lap_d_facto2_sched4_kway_svdbegin 2949/3626 Test #3204: mpi_dst_example_simple_lap_d_facto2_sched4_kway_svdend ..................***Timeout 418.99 sec Start 3204: mpi_dst_example_simple_lap_d_facto2_sched4_kway_svdend 2949/3626 Test #3205: mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_svdbegin .....***Timeout 419.48 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 
Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 3205: mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_svdbegin 2949/3626 Test #3206: mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_svdend .......***Timeout 419.42 sec Start 3206: mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_svdend 2949/3626 Test #3207: mpi_dst_example_simple_lap_d_facto2_sched4_not_pqrcpbegin ...............***Timeout 419.42 sec Start 3207: mpi_dst_example_simple_lap_d_facto2_sched4_not_pqrcpbegin 2949/3626 Test #3208: mpi_dst_example_simple_lap_d_facto2_sched4_not_pqrcpend .................***Timeout 419.40 sec Start 3208: mpi_dst_example_simple_lap_d_facto2_sched4_not_pqrcpend 2949/3626 Test #3209: mpi_dst_example_simple_lap_d_facto2_sched4_kway_pqrcpbegin ..............***Timeout 419.63 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 3209: mpi_dst_example_simple_lap_d_facto2_sched4_kway_pqrcpbegin 2949/3626 Test #3211: mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_pqrcpbegin ...***Timeout 420.92 sec Start 3211: mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_pqrcpbegin 2949/3626 Test #3212: mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_pqrcpend .....***Timeout 421.43 sec ischedInit: The thread number has been automatically set to 256 Start 3212: mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_pqrcpend 2949/3626 Test #3213: mpi_dst_example_simple_lap_d_facto2_sched4_not_rqrcpbegin ...............***Timeout 421.42 sec Start 3213: mpi_dst_example_simple_lap_d_facto2_sched4_not_rqrcpbegin 
2949/3626 Test #3214: mpi_dst_example_simple_lap_d_facto2_sched4_not_rqrcpend .................***Timeout 421.39 sec Start 3214: mpi_dst_example_simple_lap_d_facto2_sched4_not_rqrcpend 2949/3626 Test #3215: mpi_dst_example_simple_lap_d_facto2_sched4_kway_rqrcpbegin ..............***Timeout 421.36 sec Start 3215: mpi_dst_example_simple_lap_d_facto2_sched4_kway_rqrcpbegin 2949/3626 Test #3216: mpi_dst_example_simple_lap_d_facto2_sched4_kway_rqrcpend ................***Timeout 421.32 sec Start 3216: mpi_dst_example_simple_lap_d_facto2_sched4_kway_rqrcpend 2949/3626 Test #3218: mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_rqrcpend .....***Timeout 421.31 sec Start 3218: mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_rqrcpend 2949/3626 Test #3219: mpi_dst_example_simple_lap_d_facto2_sched4_not_tqrcpbegin ...............***Timeout 421.34 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 3219: mpi_dst_example_simple_lap_d_facto2_sched4_not_tqrcpbegin 2949/3626 Test #3220: 
mpi_dst_example_simple_lap_d_facto2_sched4_not_tqrcpend .................***Timeout 421.30 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 3220: mpi_dst_example_simple_lap_d_facto2_sched4_not_tqrcpend 2949/3626 Test #3221: mpi_dst_example_simple_lap_d_facto2_sched4_kway_tqrcpbegin ..............***Timeout 421.15 sec Start 3221: mpi_dst_example_simple_lap_d_facto2_sched4_kway_tqrcpbegin 2949/3626 Test #3222: mpi_dst_example_simple_lap_d_facto2_sched4_kway_tqrcpend ................***Timeout 421.14 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple 
Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 3222: mpi_dst_example_simple_lap_d_facto2_sched4_kway_tqrcpend 2949/3626 Test #3223: mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_tqrcpbegin ...***Timeout 421.13 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 3223: mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_tqrcpbegin 2949/3626 Test #3224: mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_tqrcpend .....***Timeout 421.14 sec Start 3224: mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_tqrcpend 2949/3626 Test #3225: mpi_dst_example_simple_lap_d_facto2_sched4_not_rqrrtbegin ...............***Timeout 421.19 sec Start 3225: mpi_dst_example_simple_lap_d_facto2_sched4_not_rqrrtbegin 2949/3626 Test #3226: mpi_dst_example_simple_lap_d_facto2_sched4_not_rqrrtend .................***Timeout 421.28 sec ischedInit: The thread number has been automatically set to 256 Start 3226: mpi_dst_example_simple_lap_d_facto2_sched4_not_rqrrtend 2949/3626 Test #3227: mpi_dst_example_simple_lap_d_facto2_sched4_kway_rqrrtbegin ..............***Timeout 422.01 sec Start 3227: mpi_dst_example_simple_lap_d_facto2_sched4_kway_rqrrtbegin 2949/3626 Test #3228: mpi_dst_example_simple_lap_d_facto2_sched4_kway_rqrrtend ................***Timeout 424.68 sec ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 3228: mpi_dst_example_simple_lap_d_facto2_sched4_kway_rqrrtend 2949/3626 Test #3229: mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_rqrrtbegin ...***Timeout 425.42 sec Start 3229: mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_rqrrtbegin 2949/3626 Test #3230: mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_rqrrtend .....***Timeout 425.78 sec ischedInit: The thread number has been automatically set to 256 Start 3230: mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_rqrrtend 2949/3626 Test #3231: mpi_dst_example_simple_lap_d_facto2_sched4_kway_pqrcpilu0 ...............***Timeout 426.15 sec Start 3231: mpi_dst_example_simple_lap_d_facto2_sched4_kway_pqrcpilu0 2949/3626 Test #3232: mpi_dst_example_simple_lap_d_facto2_sched4_kway_pqrcpilu1 ...............***Timeout 426.14 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been 
automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 3232: mpi_dst_example_simple_lap_d_facto2_sched4_kway_pqrcpilu1 2949/3626 Test #3233: mpi_dst_example_simple_lap_c_facto0_sched4_not_svdbegin .................***Timeout 426.12 sec Start 3233: mpi_dst_example_simple_lap_c_facto0_sched4_not_svdbegin 2949/3626 Test #3234: mpi_dst_example_simple_lap_c_facto0_sched4_not_svdend ...................***Timeout 426.12 sec Start 3234: mpi_dst_example_simple_lap_c_facto0_sched4_not_svdend 2949/3626 Test #3235: mpi_dst_example_simple_lap_c_facto0_sched4_kway_svdbegin ................***Timeout 426.10 sec Start 3235: mpi_dst_example_simple_lap_c_facto0_sched4_kway_svdbegin 2949/3626 Test #3236: mpi_dst_example_simple_lap_c_facto0_sched4_kway_svdend ..................***Timeout 426.32 sec Start 3236: mpi_dst_example_simple_lap_c_facto0_sched4_kway_svdend 2949/3626 Test #3237: mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_svdbegin .....***Timeout 426.31 sec Start 3237: mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_svdbegin 2949/3626 Test #3238: 
mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_svdend .......***Timeout 426.86 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 3238: mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_svdend 2949/3626 Test #3239: mpi_dst_example_simple_lap_c_facto0_sched4_not_pqrcpbegin ...............***Timeout 426.85 sec Start 3239: mpi_dst_example_simple_lap_c_facto0_sched4_not_pqrcpbegin 2949/3626 Test #3240: mpi_dst_example_simple_lap_c_facto0_sched4_not_pqrcpend .................***Timeout 426.83 sec Start 3240: mpi_dst_example_simple_lap_c_facto0_sched4_not_pqrcpend 2949/3626 Test #3241: mpi_dst_example_simple_lap_c_facto0_sched4_kway_pqrcpbegin ..............***Timeout 426.83 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel 
Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.260101e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.100801e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.857994e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 6.207648e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.147271e+00 s +-------------------------------------------------+ Factorization task: Factorization used: 
LL^h Time to initialize internal csc 2.346399e-01 s Start 3241: mpi_dst_example_simple_lap_c_facto0_sched4_kway_pqrcpbegin 2949/3626 Test #3242: mpi_dst_example_simple_lap_c_facto0_sched4_kway_pqrcpend ................***Timeout 427.29 sec Start 3242: mpi_dst_example_simple_lap_c_facto0_sched4_kway_pqrcpend 2949/3626 Test #3243: mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_pqrcpbegin ...***Timeout 428.43 sec Start 3243: mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_pqrcpbegin 2949/3626 Test #3244: mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_pqrcpend .....***Timeout 429.11 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 3244: mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_pqrcpend 2949/3626 Test #3245: mpi_dst_example_simple_lap_c_facto0_sched4_not_rqrcpbegin ...............***Timeout 429.50 sec Start 3245: mpi_dst_example_simple_lap_c_facto0_sched4_not_rqrcpbegin 2949/3626 Test #3246: 
mpi_dst_example_simple_lap_c_facto0_sched4_not_rqrcpend .................***Timeout 429.48 sec Start 3246: mpi_dst_example_simple_lap_c_facto0_sched4_not_rqrcpend 2949/3626 Test #3247: mpi_dst_example_simple_lap_c_facto0_sched4_kway_rqrcpbegin ..............***Timeout 429.93 sec Start 3247: mpi_dst_example_simple_lap_c_facto0_sched4_kway_rqrcpbegin 2949/3626 Test #3248: mpi_dst_example_simple_lap_c_facto0_sched4_kway_rqrcpend ................***Timeout 430.73 sec Start 3248: mpi_dst_example_simple_lap_c_facto0_sched4_kway_rqrcpend 2949/3626 Test #3249: mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_rqrcpbegin ...***Timeout 430.72 sec Start 3249: mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_rqrcpbegin 2949/3626 Test #3250: mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_rqrcpend .....***Timeout 430.72 sec Start 3250: mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_rqrcpend 2949/3626 Test #3251: mpi_dst_example_simple_lap_c_facto0_sched4_not_tqrcpbegin ...............***Timeout 430.72 sec Start 3251: mpi_dst_example_simple_lap_c_facto0_sched4_not_tqrcpbegin 2949/3626 Test #3253: mpi_dst_example_simple_lap_c_facto0_sched4_kway_tqrcpbegin ..............***Timeout 430.63 sec Start 3253: mpi_dst_example_simple_lap_c_facto0_sched4_kway_tqrcpbegin 2949/3626 Test #3254: mpi_dst_example_simple_lap_c_facto0_sched4_kway_tqrcpend ................***Timeout 430.60 sec Start 3254: mpi_dst_example_simple_lap_c_facto0_sched4_kway_tqrcpend 2949/3626 Test #3255: mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_tqrcpbegin ...***Timeout 430.58 sec Start 3255: mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_tqrcpbegin 2949/3626 Test #3256: mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_tqrcpend .....***Timeout 430.57 sec Start 3256: mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_tqrcpend 2949/3626 Test #3257: mpi_dst_example_simple_lap_c_facto0_sched4_not_rqrrtbegin ...............***Timeout 
430.56 sec Start 3257: mpi_dst_example_simple_lap_c_facto0_sched4_not_rqrrtbegin Test #2905: mpi_dst_example_simple_lap_z_facto0_sched1_not_rqrrtbegin ...............***Timeout 437.94 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.646771e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.247275e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time 
for reordering 6.589954e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.146747e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.234599e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.656823e-01 s Time to initialize coeftab 1.437200e-01 s Time to factorize 1.184464e+01 s ( 1.71 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 1.449172e+00 s - iteration 1 : total iteration time 4.75 s error 3.8651e-13 Time for refinement 1.089340e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.865079e-13 max(|| b_i - A x_i ||_1) 7.186339e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.813357e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.865079e-13 max(|| b_i - A x_i ||_1) 7.186339e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.813357e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.865079e-13 max(|| b_i - A x_i ||_1) 7.186339e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.813357e+00 (SUCCESS) 
max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.865079e-13 max(|| b_i - A x_i ||_1) 7.186339e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.813357e+00 (SUCCESS) Test #3075: mpi_dst_example_simple_lap_s_facto1_sched4_kway_svdbegin ................***Timeout 442.86 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.034659e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.111993e-02 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.998096e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 4.897570e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.643341e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.596329e+00 s Time to initialize coeftab 2.972977e+00 s Time to factorize 1.462400e+02 s (36.65 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Start 3075: mpi_dst_example_simple_lap_s_facto1_sched4_kway_svdbegin Test #3101: mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_rqrrtbegin ...***Timeout 449.42 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution 
level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.368921e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.538487e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.340255e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.450719e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.841690e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.236874e+00 s Time to initialize coeftab 3.256761e+00 s Time to factorize 2.154530e+01 s (248.74 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag 
in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.298825e+01 s - iteration 1 : total iteration time 12.5 s error 3.8623e-11 Time for refinement 2.814510e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.957778e-08 max(|| b_i - A x_i ||_1) 2.932920e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.685479e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.957778e-08 max(|| b_i - A x_i ||_1) 2.932920e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.685479e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.957778e-08 max(|| b_i - A x_i ||_1) 2.932920e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.685479e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.957778e-08 max(|| b_i - A x_i ||_1) 2.932920e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.685479e-01 (SUCCESS) Start 3101: mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_rqrrtbegin Test #3111: mpi_dst_example_simple_lap_s_facto2_sched4_not_pqrcpbegin ...............***Timeout 450.23 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI 
communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.951754e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.622511e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.162804e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 8.070675e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.169850e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.239850e-01 s Time to initialize coeftab 5.405444e-01 s Time to factorize 5.752502e+00 s ( 1.74 MFlop/s) Number of operations 
8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 2.264123e+00 s - iteration 1 : total iteration time 4.94 s error 1.2183e-11 Time for refinement 1.210425e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.725964e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.725964e-08 max(|| b_i - A x_i ||_1) 2.747236e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.452150e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.747236e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.452150e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.725964e-08 max(|| b_i - A x_i ||_1) 2.747236e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.452150e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.725964e-08 max(|| b_i - A x_i ||_1) 2.747236e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.452150e-01 (SUCCESS) Start 3111: mpi_dst_example_simple_lap_s_facto2_sched4_not_pqrcpbegin Test #3121: mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_rqrcpbegin ...***Timeout 531.47 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX 
package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.354542e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.359296e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.131160e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.044434e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.754859e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU 
Time to initialize internal csc 9.059368e-01 s Time to initialize coeftab 1.117243e+00 s Time to factorize 3.103985e+01 s (329.39 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.4 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 1.999832e+01 s - iteration 1 : total iteration time 6.46 s error 8.6676e-11 Time for refinement 1.211693e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.008497e-08 max(|| b_i - A x_i ||_1) 2.896579e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.639814e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.008497e-08 max(|| b_i - A x_i ||_1) 2.896579e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.639814e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.008497e-08 max(|| b_i - A x_i ||_1) 2.896579e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.639814e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.008497e-08 max(|| b_i - A x_i ||_1) 2.896579e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.639814e-01 (SUCCESS) Start 3121: mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_rqrcpbegin Test #3126: mpi_dst_example_simple_lap_s_facto2_sched4_kway_tqrcpend ................***Timeout 533.35 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been 
automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.088163e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.649892e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.557203e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 4.124367e+00 s 
+-------------------------------------------------+ Analyze task: Total time for analyze 1.615958e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.346784e+00 s Time to initialize coeftab 3.973091e+00 s Time to factorize 1.783241e+01 s (573.35 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 1.584894e+01 s - iteration 1 : total iteration time 8.63 s error 1.2962e-12 Time for refinement 1.938256e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.430346e-08 max(|| b_i - A x_i ||_1) 2.695946e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.387699e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.430346e-08 max(|| b_i - A x_i ||_1) 2.695946e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.387699e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.430346e-08 max(|| b_i - A x_i ||_1) 2.695946e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.387699e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.430346e-08 max(|| b_i - A x_i ||_1) 2.695946e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.387699e-01 (SUCCESS) Start 3126: mpi_dst_example_simple_lap_s_facto2_sched4_kway_tqrcpend Test #3142: mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_svdend 
.......***Timeout 538.56 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.863142e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.116407e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.508715e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 
17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.248483e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.422274e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.428134e+00 s Time to initialize coeftab 4.196822e-01 s Time to factorize 3.638897e+01 s (142.45 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Start 3142: mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_svdend Test #3145: mpi_dst_example_simple_lap_d_facto0_sched4_kway_pqrcpbegin ..............***Timeout 538.54 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY 
Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.821396e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.035379e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.119224e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.441094e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.922834e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.587971e+00 s Time to initialize coeftab 2.952692e+00 s Time to factorize 2.553514e+01 s (203.01 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.215424e+01 s - iteration 1 : total iteration time 15 s error 2.6705e-14 Time for refinement 
2.677589e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.670827e-14 max(|| b_i - A x_i ||_1) 4.671578e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.870240e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.670827e-14 max(|| b_i - A x_i ||_1) 4.671578e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.870240e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.670827e-14 max(|| b_i - A x_i ||_1) 4.671578e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.870240e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.670827e-14 max(|| b_i - A x_i ||_1) 4.671578e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.870240e-02 (SUCCESS) Start 3145: mpi_dst_example_simple_lap_d_facto0_sched4_kway_pqrcpbegin Test #3146: mpi_dst_example_simple_lap_d_facto0_sched4_kway_pqrcpend ................***Timeout 538.53 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 
Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.070877e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.969970e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.212135e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.518320e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.531405e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.374511e+00 s Time to initialize coeftab 1.027005e+00 s Time to factorize 1.332638e+01 s (388.99 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko 
------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 6.290106e+00 s - iteration 1 : total iteration time 23.5 s error 6.4967e-16 Time for refinement 3.240916e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.651152e-16 max(|| b_i - A x_i ||_1) 9.765173e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.227078e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.651152e-16 max(|| b_i - A x_i ||_1) 9.765173e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.227078e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.651152e-16 max(|| b_i - A x_i ||_1) 9.765173e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.227078e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.651152e-16 max(|| b_i - A x_i ||_1) 9.765173e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.227078e-03 (SUCCESS) Start 3146: mpi_dst_example_simple_lap_d_facto0_sched4_kway_pqrcpend Test #3148: mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_pqrcpend .....***Timeout 538.51 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 
MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.418799e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.755910e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.196401e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.012862e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.942689e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.650540e+00 s Time to initialize coeftab 1.128593e+00 s Time to factorize 1.779997e+01 s (291.22 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ 
Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 7.355915e+00 s - iteration 1 : total iteration time 15.5 s error 2.4665e-15 Time for refinement 2.796935e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.469163e-15 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.469163e-15 max(|| b_i - A x_i ||_1) 2.474554e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.109489e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.469163e-15 max(|| b_i - A x_i ||_1) 2.474554e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.109489e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 2.474554e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.109489e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.469163e-15 max(|| b_i - A x_i ||_1) 2.474554e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.109489e-03 (SUCCESS) Start 3148: mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_pqrcpend Test #3153: mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_rqrcpbegin ...***Timeout 539.35 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: 
sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.606013e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.200334e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.002759e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 5.397744e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.393374e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.677496e-01 s Time to initialize coeftab 
3.012993e-01 s Time to factorize 8.082848e+00 s (641.33 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88 Ko / 88.6 Ko ------------------------------------------------ Total 136 Ko / 137 Ko Time to solve 1.322090e+00 s - iteration 1 : total iteration time 6.61 s error 9.697e-14 Time for refinement 1.046051e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.696862e-14 max(|| b_i - A x_i ||_1) 1.903864e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.392369e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.696862e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.696862e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.696862e-14 max(|| b_i - A x_i ||_1) 1.903864e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.392369e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 1.903864e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.392369e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 1.903864e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.392369e-01 (SUCCESS) Start 3153: mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_rqrcpbegin Test #3156: mpi_dst_example_simple_lap_d_facto0_sched4_not_tqrcpend .................***Timeout 540.27 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + 
PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.067591e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.244431e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.496933e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.595564e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.423518e+01 s 
+-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.529796e+00 s Time to initialize coeftab 1.277569e+00 s Time to factorize 1.497450e+01 s (346.17 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 9.738254e+00 s - iteration 1 : total iteration time 10.2 s error 4.586e-15 Time for refinement 2.238416e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.582635e-15 max(|| b_i - A x_i ||_1) 4.269370e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.364831e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.582635e-15 max(|| b_i - A x_i ||_1) 4.269370e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.364831e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.582635e-15 max(|| b_i - A x_i ||_1) 4.269370e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.364831e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.582635e-15 max(|| b_i - A x_i ||_1) 4.269370e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.364831e-03 (SUCCESS) Start 3156: mpi_dst_example_simple_lap_d_facto0_sched4_not_tqrcpend Test #2923: mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_pqrcpbegin ...***Timeout 544.18 sec ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.518628e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.960635e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.077354e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model 
AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.512438e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.753683e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.642028e-01 s Time to initialize coeftab 5.564687e-01 s Time to factorize 2.144088e+01 s (1017.65 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 6.744888e+00 s - iteration 1 : total iteration time 4.73 s error 1.1535e-14 Time for refinement 1.176885e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.153954e-14 max(|| b_i - A x_i ||_1) 1.934870e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.882333e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.153954e-14 max(|| b_i - A x_i ||_1) 1.934870e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.882333e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.153954e-14 max(|| b_i - A x_i ||_1) 1.934870e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.882333e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.153954e-14 max(|| b_i - A x_i ||_1) 1.934870e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.882333e-02 (SUCCESS) Test #2925: 
mpi_dst_example_simple_lap_z_facto1_sched1_not_rqrcpbegin ...............***Timeout 544.99 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.187909e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.769854e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.038285e-01 s +-------------------------------------------------+ 
Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.998473e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.732036e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.961240e+00 s Time to initialize coeftab 5.336001e+00 s Time to factorize 8.199913e+01 s (266.09 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 7.846179e+00 s - iteration 1 : total iteration time 27.9 s error 1.5605e-14 Time for refinement 4.153602e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.560674e-14 max(|| b_i - A x_i ||_1) 2.417287e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.099634e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.560674e-14 max(|| b_i - A x_i ||_1) 2.417287e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.099634e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.560674e-14 max(|| b_i - A x_i ||_1) 2.417287e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.099634e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.560674e-14 max(|| b_i - A x_i ||_1) 
2.417287e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.099634e-02 (SUCCESS) Test #2932: mpi_dst_example_simple_lap_z_facto1_sched1_not_tqrcpend .................***Timeout 550.28 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.191711e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.878267e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion 
-1 Time for reordering 9.252023e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.410925e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.696279e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 8.286006e-01 s Time to initialize coeftab 8.817922e-01 s Time to factorize 3.084297e+01 s (707.43 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 3.343541e+00 s - iteration 1 : total iteration time 9.05 s error 2.2782e-15 Time for refinement 1.979522e+01 s || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.280576e-15 max(|| b_i - A x_i ||_1) 1.895749e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.783618e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.280576e-15 max(|| b_i - A x_i ||_1) 1.895749e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.783618e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.280576e-15 max(|| b_i - A x_i ||_1) 1.895749e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.783618e-03 
(SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.280576e-15 max(|| b_i - A x_i ||_1) 1.895749e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.783618e-03 (SUCCESS) Test #2937: mpi_dst_example_simple_lap_z_facto1_sched1_not_rqrrtbegin ...............***Timeout 553.49 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.701969e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.514360e-02 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.708219e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 8.921137e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.679721e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.829689e+00 s Time to initialize coeftab 2.851240e+00 s Time to factorize 6.694509e+01 s (325.93 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 8.092867e+00 s - iteration 1 : total iteration time 18.7 s error 3.0039e-13 Time for refinement 3.607215e+01 s || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.003969e-13 max(|| b_i - A x_i ||_1) 4.646023e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.172349e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.003969e-13 max(|| b_i - A x_i ||_1) 4.646023e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.172349e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.003969e-13 
max(|| b_i - A x_i ||_1) 4.646023e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.172349e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.003969e-13 max(|| b_i - A x_i ||_1) 4.646023e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.172349e+00 (SUCCESS) Test #2941: mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_rqrrtbegin ...***Timeout 554.26 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.301138e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number 
of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.322576e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.243298e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.886896e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.685727e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.315385e+00 s Time to initialize coeftab 5.768362e+00 s Time to factorize 6.384207e+01 s (341.77 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 7.930779e+00 s - iteration 1 : total iteration time 14.7 s error 3.7425e-13 Time for refinement 2.524810e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.742536e-13 max(|| b_i - A x_i ||_1) 7.071706e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.784431e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.742536e-13 max(|| b_i - A x_i ||_1) 7.071706e-14 max(|| b_i - A x_i ||_1 / 
(||A||_1 * ||x_i||_oo * eps)) 1.784431e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.742536e-13 max(|| b_i - A x_i ||_1) 7.071706e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.784431e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.742536e-13 max(|| b_i - A x_i ||_1) 7.071706e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.784431e+00 (SUCCESS) Test #2944: mpi_dst_example_simple_lap_z_facto1_sched1_kway_pqrcpilu1 ...............***Timeout 555.89 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.468210e+01 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.078549e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.047932e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.098840e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.613420e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 6.827048e-01 s Time to initialize coeftab 1.793983e-01 s Time to factorize 6.925114e+01 s (315.08 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 1.101455e+01 s - iteration 1 : total iteration time 4.52 s error 8.0786e-15 Time for refinement 1.285510e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.074433e-15 max(|| b_i - A x_i ||_1) 1.223675e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.087746e-02 
(SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.074433e-15 max(|| b_i - A x_i ||_1) 1.223675e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.087746e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.074433e-15 max(|| b_i - A x_i ||_1) 1.223675e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.087746e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.074433e-15 max(|| b_i - A x_i ||_1) 1.223675e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.087746e-02 (SUCCESS) Test #2946: mpi_dst_example_simple_lap_z_facto2_sched1_not_svdend ...................***Timeout 556.70 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ 
Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.867250e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.462976e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.903044e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.700648e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.077981e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 9.302464e-01 s Time to initialize coeftab 8.115315e-01 s Time to factorize 6.840016e+00 s ( 5.84 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 355 Ko / 355 Ko ------------------------------------------------ Total 451 Ko / 451 Ko Time to solve 1.337715e+00 s - iteration 1 : total iteration time 2.78 s error 2.1188e-15 Time for refinement 6.449221e+00 s || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.122452e-15 max(|| b_i - A x_i ||_1) 
2.892726e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.299330e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.122452e-15 max(|| b_i - A x_i ||_1) 2.892726e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.299330e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.122452e-15 max(|| b_i - A x_i ||_1) 2.892726e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.299330e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.122452e-15 max(|| b_i - A x_i ||_1) 2.892726e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.299330e-03 (SUCCESS) Test #2947: mpi_dst_example_simple_lap_z_facto2_sched1_kway_svdbegin ................***Timeout 556.68 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 
1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.431745e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.665182e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.770977e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 3.017061e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.829181e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.022761e+00 s Time to initialize coeftab 4.271407e+00 s Time to factorize 9.766409e+01 s (419.08 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 354 Ko / 355 Ko ------------------------------------------------ Total 450 Ko / 451 Ko Time to solve 6.385726e+00 s - iteration 1 : total iteration time 17.1 s error 1.384e-14 Time for refinement 3.212550e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 
max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.383963e-14 max(|| b_i - A x_i ||_1) 2.630967e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.638822e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.383963e-14 max(|| b_i - A x_i ||_1) 2.630967e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.638822e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.383963e-14 max(|| b_i - A x_i ||_1) 2.630967e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.638822e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.383963e-14 max(|| b_i - A x_i ||_1) 2.630967e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.638822e-02 (SUCCESS) Test #2948: mpi_dst_example_simple_lap_z_facto2_sched1_kway_svdend ..................***Timeout 556.65 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian 
Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.565218e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.768875e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.158183e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 3.960670e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.200618e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.742443e+00 s Time to initialize coeftab 2.058064e-01 s Time to factorize 3.992794e+01 s ( 1.00 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 355 Ko / 355 Ko ------------------------------------------------ Total 451 Ko / 451 Ko Time to solve 4.244768e+00 s - iteration 1 : total iteration time 10.4 s error 4.2498e-15 Time for refinement 1.840914e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i 
||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.257056e-15 max(|| b_i - A x_i ||_1) 4.173661e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.053156e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.257056e-15 max(|| b_i - A x_i ||_1) 4.173661e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.053156e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.257056e-15 max(|| b_i - A x_i ||_1) 4.173661e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.053156e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.257056e-15 max(|| b_i - A x_i ||_1) 4.173661e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.053156e-02 (SUCCESS) Test #2949: mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_svdbegin .....***Timeout 556.63 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 
256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.332554e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.928672e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.576979e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.221385e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.810625e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.639763e+00 s Time to initialize coeftab 3.009079e+00 s Time to factorize 5.874711e+01 s (696.70 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 355 Ko / 355 Ko ------------------------------------------------ Total 451 Ko / 451 Ko Time to solve 4.640146e+00 s - iteration 1 : total iteration time 15.2 s error 1.6614e-14 Time for refinement 2.533227e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 
6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.661849e-14 max(|| b_i - A x_i ||_1) 2.793501e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.048950e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.661849e-14 max(|| b_i - A x_i ||_1) 2.793501e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.048950e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.661849e-14 max(|| b_i - A x_i ||_1) 2.793501e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.048950e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.661849e-14 max(|| b_i - A x_i ||_1) 2.793501e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.048950e-02 (SUCCESS) Test #2952: mpi_dst_example_simple_lap_z_facto2_sched1_not_pqrcpend .................***Timeout 557.57 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute 
Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.659103e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.555319e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.954885e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 4.760814e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.268374e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.146353e+00 s Time to initialize coeftab 7.630889e-01 s Time to factorize 3.889633e+01 s ( 1.03 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 355 Ko / 355 Ko ------------------------------------------------ Total 451 Ko / 451 Ko Time to solve 3.984175e+00 s - iteration 1 : total iteration time 5.43 s error 8.0808e-16 Time for refinement 1.095493e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i 
||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.119684e-16 max(|| b_i - A x_i ||_1) 1.376328e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.472943e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.119684e-16 max(|| b_i - A x_i ||_1) 1.376328e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.472943e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.119684e-16 max(|| b_i - A x_i ||_1) 1.376328e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.472943e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.119684e-16 max(|| b_i - A x_i ||_1) 1.376328e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.472943e-03 (SUCCESS) Test #2954: mpi_dst_example_simple_lap_z_facto2_sched1_kway_pqrcpend ................***Timeout 557.55 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy 
KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.117518e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.557436e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.204293e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 5.925081e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.962464e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.390598e-01 s Time to initialize coeftab 1.003449e-01 s Time to factorize 2.843856e+00 s (14.05 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 355 Ko / 355 Ko ------------------------------------------------ Total 451 Ko / 451 Ko Time to solve 9.775170e-01 s - iteration 1 : total iteration time 0.735 s error 1.6193e-15 Time for refinement 
2.678983e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.621825e-15 max(|| b_i - A x_i ||_1) 1.886008e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.759038e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.621825e-15 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.621825e-15 max(|| b_i - A x_i ||_1) 1.886008e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.759038e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.621825e-15 max(|| b_i - A x_i ||_1) 1.886008e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.759038e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 1.886008e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.759038e-03 (SUCCESS) Test #2957: mpi_dst_example_simple_lap_z_facto2_sched1_not_rqrcpbegin ...............***Timeout 558.47 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 
Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.788901e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.490813e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.423738e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 3.339428e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.687191e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.382049e-02 s Time to initialize coeftab 1.454659e-01 s Time to factorize 3.908753e+00 s (10.23 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 355 Ko / 355 Ko ------------------------------------------------ Total 451 Ko / 451 Ko Time to solve 
3.607965e-01 s - iteration 1 : total iteration time 0.839 s error 1.584e-14 Time for refinement 1.877873e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.583980e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.583980e-14 max(|| b_i - A x_i ||_1) 2.460150e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.207794e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.583980e-14 max(|| b_i - A x_i ||_1) 2.460150e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.207794e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 2.460150e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.207794e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.583980e-14 max(|| b_i - A x_i ||_1) 2.460150e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.207794e-02 (SUCCESS) Test #2964: mpi_dst_example_simple_lap_z_facto2_sched1_not_tqrcpend .................***Timeout 561.05 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank 
parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.316144e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.325509e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.490240e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 4.280654e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.967059e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.217310e+00 s Time to initialize coeftab 7.248271e-01 s Time to factorize 4.546402e+01 s (900.26 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 355 Ko / 355 Ko 
------------------------------------------------ Total 451 Ko / 451 Ko Time to solve 1.932767e+00 s - iteration 1 : total iteration time 12.8 s error 2.6908e-15 Time for refinement 1.973093e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.691363e-15 max(|| b_i - A x_i ||_1) 2.647909e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.681572e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.691363e-15 max(|| b_i - A x_i ||_1) 2.647909e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.681572e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.691363e-15 max(|| b_i - A x_i ||_1) 2.647909e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.681572e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.691363e-15 max(|| b_i - A x_i ||_1) 2.647909e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.681572e-03 (SUCCESS) Test #2965: mpi_dst_example_simple_lap_z_facto2_sched1_kway_tqrcpbegin ..............***Timeout 561.03 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational 
models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.901180e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.564412e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.166065e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 3.492843e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.417208e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.764451e+00 s Time to initialize coeftab 1.001283e+01 s Time to factorize 4.449267e+01 s (919.91 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko 
Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 355 Ko / 355 Ko ------------------------------------------------ Total 451 Ko / 451 Ko Time to solve 7.902402e+00 s - iteration 1 : total iteration time 14.9 s error 2.5687e-14 Time for refinement 2.845546e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.569020e-14 max(|| b_i - A x_i ||_1) 3.535711e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.921797e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.569020e-14 max(|| b_i - A x_i ||_1) 3.535711e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.921797e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.569020e-14 max(|| b_i - A x_i ||_1) 3.535711e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.921797e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.569020e-14 max(|| b_i - A x_i ||_1) 3.535711e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.921797e-02 (SUCCESS) Test #2970: mpi_dst_example_simple_lap_z_facto2_sched1_not_rqrrtend .................***Timeout 562.62 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 
Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.943085e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.426243e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.731584e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 3.408427e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.540491e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.726853e+00 s Time to initialize coeftab 1.611556e+00 s Time to factorize 4.496681e+01 s (910.21 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ 
Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 355 Ko / 355 Ko ------------------------------------------------ Total 451 Ko / 451 Ko Time to solve 8.304005e+00 s - iteration 1 : total iteration time 12.9 s error 1.2931e-14 Time for refinement 2.543233e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.293489e-14 max(|| b_i - A x_i ||_1) 1.379262e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.480346e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.293489e-14 max(|| b_i - A x_i ||_1) 1.379262e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.480346e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.293489e-14 max(|| b_i - A x_i ||_1) 1.379262e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.480346e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.293489e-14 max(|| b_i - A x_i ||_1) 1.379262e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.480346e-02 (SUCCESS) Test #2971: mpi_dst_example_simple_lap_z_facto2_sched1_kway_rqrrtbegin ..............***Timeout 562.60 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: 
PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.500777e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.260605e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.524840e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 3.707081e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.962221e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.974184e+00 s Time to initialize coeftab 7.342864e+00 s Time to factorize 9.812998e+01 s (417.09 KFlop/s) Number of operations 32.61 MFlops Number 
of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 355 Ko / 355 Ko ------------------------------------------------ Total 451 Ko / 451 Ko Time to solve 7.208150e+00 s - iteration 1 : total iteration time 19.5 s error 2.7834e-13 Time for refinement 3.888250e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.783400e-13 max(|| b_i - A x_i ||_1) 4.990499e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.259272e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.783400e-13 max(|| b_i - A x_i ||_1) 4.990499e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.259272e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.783400e-13 max(|| b_i - A x_i ||_1) 4.990499e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.259272e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.783400e-13 max(|| b_i - A x_i ||_1) 4.990499e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.259272e+00 (SUCCESS) Test #2972: mpi_dst_example_simple_lap_z_facto2_sched1_kway_rqrrtend ................***Timeout 562.58 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 
Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.711267e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.080733e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.631122e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 6.572834e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.614641e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.142826e+00 s Time to initialize coeftab 1.335890e+00 s Time to 
factorize 2.523662e+01 s ( 1.58 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 355 Ko / 355 Ko ------------------------------------------------ Total 451 Ko / 451 Ko Time to solve 1.153764e+01 s - iteration 1 : total iteration time 17.3 s error 4.0821e-15 Time for refinement 3.244074e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.088340e-15 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.088340e-15 max(|| b_i - A x_i ||_1) 4.782517e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.206791e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 4.782517e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.206791e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.088340e-15 max(|| b_i - A x_i ||_1) 4.782517e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.206791e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.088340e-15 max(|| b_i - A x_i ||_1) 4.782517e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.206791e-02 (SUCCESS) Test #2973: mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_rqrrtbegin ...***Timeout 562.54 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.691664e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.913010e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.151591e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 3.698403e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.261137e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time 
to initialize internal csc 1.112007e+00 s Time to initialize coeftab 9.093775e+00 s Time to factorize 3.434231e+01 s ( 1.16 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 355 Ko / 355 Ko ------------------------------------------------ Total 451 Ko / 451 Ko Time to solve 5.561250e+00 s - iteration 1 : total iteration time 8.23 s error 7.182e-14 Time for refinement 1.676347e+01 s || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.181746e-14 max(|| b_i - A x_i ||_1) 1.553630e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.920336e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.181746e-14 max(|| b_i - A x_i ||_1) 1.553630e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.920336e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.181746e-14 max(|| b_i - A x_i ||_1) 1.553630e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.920336e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.181746e-14 max(|| b_i - A x_i ||_1) 1.553630e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.920336e-01 (SUCCESS) Test #2975: mpi_dst_example_simple_lap_z_facto2_sched1_kway_pqrcpilu0 ...............***Timeout 562.58 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled 
thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.211544e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.328277e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.672214e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 5.097860e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.949825e+01 s 
+-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.425489e+00 s Time to initialize coeftab 8.630396e-01 s Time to factorize 4.354406e+01 s (939.95 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 355 Ko / 355 Ko ------------------------------------------------ Total 451 Ko / 451 Ko Time to solve 2.153424e+00 s - iteration 1 : total iteration time 5.07 s error 1.1611e-14 Time for refinement 1.197561e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.161173e-14 max(|| b_i - A x_i ||_1) 1.727506e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.359084e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.161173e-14 max(|| b_i - A x_i ||_1) 1.727506e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.359084e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.161173e-14 max(|| b_i - A x_i ||_1) 1.727506e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.359084e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.161173e-14 max(|| b_i - A x_i ||_1) 1.727506e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.359084e-02 (SUCCESS) Test #2976: mpi_dst_example_simple_lap_z_facto2_sched1_kway_pqrcpilu1 ...............***Timeout 562.49 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number 
has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.274260e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.184399e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.382566e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 3.151152e+00 s 
+-------------------------------------------------+ Analyze task: Total time for analyze 1.734298e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 7.675286e-01 s Time to initialize coeftab 4.306862e-01 s Time to factorize 4.959927e+01 s (825.20 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 355 Ko / 355 Ko ------------------------------------------------ Total 451 Ko / 451 Ko Time to solve 6.391852e+00 s - iteration 1 : total iteration time 20.2 s error 7.7586e-15 Time for refinement 2.972325e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.759708e-15 max(|| b_i - A x_i ||_1) 1.197555e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.021838e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.759708e-15 max(|| b_i - A x_i ||_1) 1.197555e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.021838e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.759708e-15 max(|| b_i - A x_i ||_1) 1.197555e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.021838e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.759708e-15 max(|| b_i - A x_i ||_1) 1.197555e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.021838e-02 (SUCCESS) Test #2983: mpi_dst_example_simple_lap_z_facto3_sched1_not_pqrcpbegin ...............***Timeout 566.17 sec ischedInit: The thread number has been automatically 
set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.861040e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.201137e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.418545e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 
6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.711646e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.517741e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 8.667727e-01 s Time to initialize coeftab 1.572445e+00 s Time to factorize 8.262200e+01 s (251.36 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 5.764795e+00 s - iteration 1 : total iteration time 13.5 s error 1.3934e-14 Time for refinement 2.395214e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.393778e-14 max(|| b_i - A x_i ||_1) 2.219387e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.600266e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.393778e-14 max(|| b_i - A x_i ||_1) 2.219387e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.600266e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.393778e-14 max(|| b_i - A x_i ||_1) 2.219387e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.600266e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.393778e-14 max(|| b_i - A x_i ||_1) 2.219387e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.600266e-02 (SUCCESS) Test #2986: mpi_dst_example_simple_lap_z_facto3_sched1_kway_pqrcpend 
................***Timeout 566.82 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.368991e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.242192e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.410794e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 
17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 9.646504e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.498249e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 2.715802e-01 s Time to initialize coeftab 1.628927e-01 s Time to factorize 5.550016e+00 s ( 3.65 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 2.284708e+00 s - iteration 1 : total iteration time 8.36 s error 3.3562e-16 Time for refinement 1.454827e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.610729e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.610729e-16 max(|| b_i - A x_i ||_1) 8.474923e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.138510e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.610729e-16 max(|| b_i - A x_i ||_1) 8.474923e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.138510e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 8.474923e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.138510e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.610729e-16 max(|| b_i - A x_i ||_1) 8.474923e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.138510e-03 
(SUCCESS) Test #2991: mpi_dst_example_simple_lap_z_facto3_sched1_kway_rqrcpbegin ..............***Timeout 568.36 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.402778e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.914004e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.813844e-01 s 
+-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 7.284900e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.836309e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 2.990223e-01 s Time to initialize coeftab 6.289377e-01 s Time to factorize 6.169947e+01 s (336.59 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 8.160687e+00 s - iteration 1 : total iteration time 22 s error 1.8557e-14 Time for refinement 2.956163e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.855422e-14 max(|| b_i - A x_i ||_1) 3.001273e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.573230e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.855422e-14 max(|| b_i - A x_i ||_1) 3.001273e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.573230e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.855422e-14 max(|| b_i - A x_i ||_1) 3.001273e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.573230e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i 
||_2) 1.855422e-14 max(|| b_i - A x_i ||_1) 3.001273e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.573230e-02 (SUCCESS) Test #2992: mpi_dst_example_simple_lap_z_facto3_sched1_kway_rqrcpend ................***Timeout 568.22 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.311755e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.322448e-03 s +-------------------------------------------------+ Reordering 
subtask: Split level 0 Stoping criterion -1 Time for reordering 2.731981e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.129404e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.669656e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 7.698994e-01 s Time to initialize coeftab 1.739876e+00 s Time to factorize 1.587229e+01 s ( 1.28 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 2.726545e+00 s - iteration 1 : total iteration time 8.09 s error 3.8592e-16 Time for refinement 2.191593e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.069051e-16 max(|| b_i - A x_i ||_1) 9.314966e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.350482e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.069051e-16 max(|| b_i - A x_i ||_1) 9.314966e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.350482e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.069051e-16 max(|| b_i - A x_i ||_1) 9.314966e-17 max(|| b_i - A x_i ||_1 / 
(||A||_1 * ||x_i||_oo * eps)) 2.350482e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.069051e-16 max(|| b_i - A x_i ||_1) 9.314966e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.350482e-03 (SUCCESS) Test #2996: mpi_dst_example_simple_lap_z_facto3_sched1_not_tqrcpend .................***Timeout 570.09 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.879552e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute 
symbol matrix 1.061853e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.260076e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 7.747503e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.993536e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 5.484420e+00 s Time to initialize coeftab 7.464942e-01 s Time to factorize 2.957109e+01 s (702.30 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 1.006247e+01 s - iteration 1 : total iteration time 16.1 s error 3.8623e-16 Time for refinement 2.742598e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.078554e-16 max(|| b_i - A x_i ||_1) 9.092361e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.294311e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.078554e-16 max(|| b_i - A x_i ||_1) 9.092361e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.294311e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || 
b_i ||_2) 4.078554e-16 max(|| b_i - A x_i ||_1) 9.092361e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.294311e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.078554e-16 max(|| b_i - A x_i ||_1) 9.092361e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.294311e-03 (SUCCESS) Test #2998: mpi_dst_example_simple_lap_z_facto3_sched1_kway_tqrcpend ................***Timeout 570.01 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.114713e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct 
Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.789817e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.388696e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 4.756664e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.761911e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 2.423778e+00 s Time to initialize coeftab 7.120022e+00 s Time to factorize 2.525369e+01 s (822.36 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 4.592703e+00 s - iteration 1 : total iteration time 10.7 s error 5.2673e-17 Time for refinement 2.260941e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.426915e-16 max(|| b_i - A x_i ||_1) 6.853993e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.729495e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.426915e-16 max(|| b_i - A x_i ||_1) 6.853993e-17 max(|| b_i - A x_i ||_1 / 
(||A||_1 * ||x_i||_oo * eps)) 1.729495e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.426915e-16 max(|| b_i - A x_i ||_1) 6.853993e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.729495e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.426915e-16 max(|| b_i - A x_i ||_1) 6.853993e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.729495e-03 (SUCCESS) Test #3002: mpi_dst_example_simple_lap_z_facto3_sched1_not_rqrrtend .................***Timeout 570.27 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.773839e+01 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.170056e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.704643e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 7.669621e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.150390e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 5.998800e+00 s Time to initialize coeftab 1.860892e+00 s Time to factorize 2.036451e+01 s (1019.80 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Test #3012: mpi_dst_example_simple_lap_z_facto4_sched1_kway_svdend ..................***Timeout 572.93 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: 
PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.318769e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.038716e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.412378e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.181219e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.334338e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 1.565234e+00 s Time to initialize coeftab 4.090245e-01 s Time to factorize 1.107365e+02 s (197.04 KFlop/s) Number of operations 106.83 MFlops 
Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 7.240705e+00 s - iteration 1 : total iteration time 13.1 s error 3.1718e-16 Time for refinement 2.723350e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.419739e-16 max(|| b_i - A x_i ||_1) 8.407503e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.121498e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.419739e-16 max(|| b_i - A x_i ||_1) 8.407503e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.121498e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.419739e-16 max(|| b_i - A x_i ||_1) 8.407503e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.121498e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.419739e-16 max(|| b_i - A x_i ||_1) 8.407503e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.121498e-03 (SUCCESS) Test #3021: mpi_dst_example_simple_lap_z_facto4_sched1_not_rqrcpbegin ...............***Timeout 575.94 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI 
communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.636208e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.224535e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.003631e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.111567e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.379203e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 4.887619e-01 s Time to initialize coeftab 
6.229298e-01 s Time to factorize 8.818060e+01 s (247.44 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 1.681199e+01 s - iteration 1 : total iteration time 2.54 s error 2.4464e-14 Time for refinement 9.971488e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.446358e-14 max(|| b_i - A x_i ||_1) 3.408155e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.599931e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.446358e-14 max(|| b_i - A x_i ||_1) 3.408155e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.599931e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.446358e-14 max(|| b_i - A x_i ||_1) 3.408155e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.599931e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.446358e-14 max(|| b_i - A x_i ||_1) 3.408155e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.599931e-02 (SUCCESS) Test #3022: mpi_dst_example_simple_lap_z_facto4_sched1_not_rqrcpend .................***Timeout 575.92 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled 
StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.116771e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.373539e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.393092e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 9.546091e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.575990e+01 s +-------------------------------------------------+ Factorization task: 
Factorization used: LDL^h Time to initialize internal csc 6.441035e+00 s Time to initialize coeftab 6.678011e-01 s Time to factorize 4.536067e+01 s (481.02 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 3.799342e+00 s - iteration 1 : total iteration time 8.13 s error 6.9824e-16 Time for refinement 1.590760e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 Test #3024: mpi_dst_example_simple_lap_z_facto4_sched1_kway_rqrcpend ................***Timeout 576.51 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP 
Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.030044e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.865178e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.957734e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.832333e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.472544e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 1.020317e+00 s Time to initialize coeftab 6.494919e-01 s Time to factorize 2.239691e+01 s (974.21 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 3.901119e+00 s - iteration 1 : total iteration time 
5.29 s error 2.4051e-16 Time for refinement 9.450226e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.726714e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.726714e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.726714e-16 max(|| b_i - A x_i ||_1) 8.266982e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.086040e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 8.266982e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.086040e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 8.266982e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.086040e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.726714e-16 max(|| b_i - A x_i ||_1) 8.266982e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.086040e-03 (SUCCESS) Test #3028: mpi_dst_example_simple_lap_z_facto4_sched1_not_tqrcpend .................***Timeout 578.01 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block 
Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.129875e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.370481e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.476003e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.903049e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.541552e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 1.674732e+00 s Time to initialize coeftab 5.213052e-01 s Time to factorize 2.856327e+01 s (763.90 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ 
Total 274 Ko / 274 Ko Time to solve 3.012530e+00 s - iteration 1 : total iteration time 5.9 s error 3.3129e-16 Time for refinement 1.934496e+01 s || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.479136e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.479136e-16 max(|| b_i - A x_i ||_1) 8.238782e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.078924e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 8.238782e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.078924e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.479136e-16 max(|| b_i - A x_i ||_1) 8.238782e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.078924e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.479136e-16 max(|| b_i - A x_i ||_1) 8.238782e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.078924e-03 (SUCCESS) Test #3030: mpi_dst_example_simple_lap_z_facto4_sched1_kway_tqrcpend ................***Timeout 578.59 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia 
K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.800591e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.725126e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.421695e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 8.357405e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.883566e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 1.484999e+00 s Time to initialize coeftab 6.497772e-01 s Time to factorize 3.062007e+01 s (712.58 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o 
/ 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 7.956778e+00 s - iteration 1 : total iteration time 18.5 s error 3.7657e-16 Time for refinement 3.099361e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.010367e-16 max(|| b_i - A x_i ||_1) 9.160436e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.311489e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.010367e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.010367e-16 max(|| b_i - A x_i ||_1) 9.160436e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.311489e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 9.160436e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.311489e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.010367e-16 max(|| b_i - A x_i ||_1) 9.160436e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.311489e-03 (SUCCESS) Test #3032: mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_tqrcpend .....***Timeout 579.29 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel 
MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.060845e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.637580e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.378888e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 4.333340e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.731414e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 1.445295e+00 s Time to initialize coeftab 2.773951e-01 s Time to factorize 4.168343e+01 s (523.45 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o 
Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 8.429719e+00 s - iteration 1 : total iteration time 16.2 s error 2.7306e-15 Time for refinement 2.639984e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.736221e-15 max(|| b_i - A x_i ||_1) 2.346927e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.922092e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.736221e-15 max(|| b_i - A x_i ||_1) 2.346927e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.922092e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.736221e-15 max(|| b_i - A x_i ||_1) 2.346927e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.922092e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.736221e-15 max(|| b_i - A x_i ||_1) 2.346927e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.922092e-03 (SUCCESS) Test #3034: mpi_dst_example_simple_lap_z_facto4_sched1_not_rqrrtend .................***Timeout 579.47 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 
0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.782701e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.013423e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.031681e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.034833e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.223342e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 1.046086e+00 s Time to initialize coeftab 2.162509e-01 s Time to factorize 4.017910e+01 s (543.05 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 
Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 5.815007e+00 s - iteration 1 : total iteration time 4.55 s error 2.6126e-15 Time for refinement 1.450173e+01 s || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.626104e-15 max(|| b_i - A x_i ||_1) 2.769633e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.988723e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.626104e-15 max(|| b_i - A x_i ||_1) 2.769633e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.988723e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.626104e-15 max(|| b_i - A x_i ||_1) 2.769633e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.988723e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.626104e-15 max(|| b_i - A x_i ||_1) 2.769633e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.988723e-03 (SUCCESS) Test #3042: mpi_dst_example_simple_lap_s_facto0_sched4_not_svdend ...................***Timeout 583.72 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per 
process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.176805e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.390981e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.064172e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 9.050025e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.592507e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.047578e-01 s Time to initialize coeftab 1.903023e-01 s Time to factorize 4.873170e+00 s ( 
1.04 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 4.718017e+00 s Time for refinement 4.650570e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.925842e-07 max(|| b_i - A x_i ||_1) 8.575063e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.077534e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.925842e-07 max(|| b_i - A x_i ||_1) 8.575063e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.077534e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.925842e-07 max(|| b_i - A x_i ||_1) 8.575063e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.077534e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.925842e-07 max(|| b_i - A x_i ||_1) 8.575063e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.077534e+00 (SUCCESS) Test #3044: mpi_dst_example_simple_lap_s_facto0_sched4_kway_svdend ..................***Timeout 584.30 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started 
PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.734493e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.851231e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.531590e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 7.872285e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.332157e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.843185e+00 s Time to initialize coeftab 1.518820e+00 s Time to factorize 
3.606865e+01 s (143.72 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Test #3048: mpi_dst_example_simple_lap_s_facto0_sched4_not_pqrcpend .................***Timeout 584.55 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.966409e+01 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.690549e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.947754e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.438612e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.502189e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.537364e+00 s Time to initialize coeftab 8.099286e-01 s Time to factorize 1.357868e+01 s (381.76 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 2.214902e+01 s Time for refinement 2.356426e+00 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.987880e-07 max(|| b_i - A x_i ||_1) 8.676070e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.090227e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.987880e-07 
max(|| b_i - A x_i ||_1) 8.676070e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.090227e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.987880e-07 max(|| b_i - A x_i ||_1) 8.676070e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.090227e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.987880e-07 max(|| b_i - A x_i ||_1) 8.676070e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.090227e+00 (SUCCESS) Test #3050: mpi_dst_example_simple_lap_s_facto0_sched4_kway_pqrcpend ................***Timeout 584.50 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute 
ordering 4.471401e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.089347e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.644746e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 9.693524e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.852376e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.639869e+00 s Time to initialize coeftab 5.956324e-01 s Time to factorize 1.369227e+01 s (378.59 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 4.919115e+00 s - iteration 1 : total iteration time 11.4 s error 2.2444e-11 Time for refinement 2.348434e+01 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.877758e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.877758e-08 max(|| b_i - A x_i ||_1) 
2.843605e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.573247e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.843605e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.573247e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.877758e-08 max(|| b_i - A x_i ||_1) 2.843605e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.573247e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.877758e-08 max(|| b_i - A x_i ||_1) 2.843605e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.573247e-01 (SUCCESS) Test #3051: mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_pqrcpbegin ...***Timeout 584.42 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 
+-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.903161e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.594820e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.524907e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 8.506289e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.133445e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.974445e+00 s Time to initialize coeftab 1.650019e+00 s Time to factorize 2.410775e+01 s (215.03 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko Test #3052: mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_pqrcpend .....***Timeout 584.36 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled 
thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.619661e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.432640e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.222961e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 7.486778e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.431116e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.496298e+00 s Time to initialize coeftab 6.552507e-01 s Time to factorize 
7.540948e+00 s (687.42 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.547127e+01 s - iteration 1 : total iteration time 6.25 s error 2.1449e-11 Time for refinement 1.970234e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.832942e-08 max(|| b_i - A x_i ||_1) 2.873832e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.611229e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.832942e-08 max(|| b_i - A x_i ||_1) 2.873832e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.611229e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.832942e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.832942e-08 max(|| b_i - A x_i ||_1) 2.873832e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.611229e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.873832e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.611229e-01 (SUCCESS) Test #3053: mpi_dst_example_simple_lap_s_facto0_sched4_not_rqrcpbegin ...............***Timeout 584.30 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.797443e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.855093e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.455904e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.797799e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.430231e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize 
internal csc 1.796418e+00 s Time to initialize coeftab 6.947543e+00 s Time to factorize 1.616198e+01 s (320.74 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44 Ko / 44.3 Ko ------------------------------------------------ Total 68.2 Ko / 68.5 Ko Time to solve 3.967645e+00 s - iteration 1 : total iteration time 10.7 s error 4.9514e-11 Time for refinement 1.666449e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.157901e-08 max(|| b_i - A x_i ||_1) 3.011832e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.784639e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.157901e-08 max(|| b_i - A x_i ||_1) 3.011832e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.784639e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.157901e-08 max(|| b_i - A x_i ||_1) 3.011832e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.784639e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.157901e-08 max(|| b_i - A x_i ||_1) 3.011832e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.784639e-01 (SUCCESS) Test #3055: mpi_dst_example_simple_lap_s_facto0_sched4_kway_rqrcpbegin ..............***Timeout 584.19 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse 
matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.957784e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.727673e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.779708e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 6.828506e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.803004e+00 s 
+-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.147124e-01 s Time to initialize coeftab 3.069055e-01 s Time to factorize 6.553762e+00 s (790.96 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.2 Ko / 44.3 Ko ------------------------------------------------ Total 68.4 Ko / 68.5 Ko Time to solve 1.307285e+00 s - iteration 1 : total iteration time 8.01 s error 6.0861e-11 Time for refinement 1.203931e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.932268e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.932268e-08 max(|| b_i - A x_i ||_1) 2.927176e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.678261e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.932268e-08 max(|| b_i - A x_i ||_1) 2.927176e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.678261e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.927176e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.678261e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.932268e-08 max(|| b_i - A x_i ||_1) 2.927176e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.678261e-01 (SUCCESS) Test #3060: mpi_dst_example_simple_lap_s_facto0_sched4_not_tqrcpend .................***Timeout 584.88 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.845941e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.750446e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.874505e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.271161e+00 
s +-------------------------------------------------+ Analyze task: Total time for analyze 3.301741e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.609589e+00 s Time to initialize coeftab 6.720373e-01 s Time to factorize 1.998517e+01 s (259.38 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 9.307604e+00 s Time for refinement 2.798872e+00 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.684943e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.684943e-07 max(|| b_i - A x_i ||_1) 1.177793e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.480003e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.684943e-07 max(|| b_i - A x_i ||_1) 1.177793e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.480003e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.177793e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.480003e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.684943e-07 max(|| b_i - A x_i ||_1) 1.177793e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.480003e+00 (SUCCESS) Test #3061: mpi_dst_example_simple_lap_s_facto0_sched4_kway_tqrcpbegin ..............***Timeout 584.85 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been 
automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.613495e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.810088e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.208041e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for 
mapping/scheduling 4.009869e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.638779e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.057306e+00 s Time to initialize coeftab 4.338854e+00 s Time to factorize 2.054604e+01 s (252.30 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44 Ko / 44.3 Ko ------------------------------------------------ Total 68.2 Ko / 68.5 Ko Time to solve 5.685929e+00 s - iteration 1 : total iteration time 8.74 s error 4.5929e-11 Time for refinement 1.760568e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.362843e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.362843e-08 max(|| b_i - A x_i ||_1) 3.063318e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.849337e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 3.063318e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.849337e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.362843e-08 max(|| b_i - A x_i ||_1) 3.063318e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.849337e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.362843e-08 max(|| b_i - A x_i ||_1) 3.063318e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.849337e-01 (SUCCESS) Test #3062: mpi_dst_example_simple_lap_s_facto0_sched4_kway_tqrcpend ................***Timeout 584.82 sec ischedInit: The 
thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.333345e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.055799e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.691829e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 
MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 7.945157e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.304773e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.270532e-01 s Time to initialize coeftab 6.403127e-02 s Time to factorize 2.994389e+00 s ( 1.69 MFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.313003e+00 s Time for refinement 1.058032e+00 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.055277e-07 max(|| b_i - A x_i ||_1) 1.106472e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.390383e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.055277e-07 max(|| b_i - A x_i ||_1) 1.106472e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.390383e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.055277e-07 max(|| b_i - A x_i ||_1) 1.106472e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.390383e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.055277e-07 max(|| b_i - A x_i ||_1) 1.106472e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.390383e+00 (SUCCESS) Test #3164: mpi_dst_example_simple_lap_d_facto0_sched4_kway_rqrrtend ................***Timeout 585.84 
sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.998005e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.488846e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.731046e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in 
full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 5.824353e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.647239e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.627550e+00 s Time to initialize coeftab 6.467349e-01 s Time to factorize 1.613335e+01 s (321.31 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 8.336245e+00 s - iteration 1 : total iteration time 11.6 s error 2.9385e-13 Time for refinement 1.825139e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.938435e-13 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.938435e-13 max(|| b_i - A x_i ||_1) 2.852409e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.584297e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.938435e-13 max(|| b_i - A x_i ||_1) 2.852409e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.584297e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.938435e-13 max(|| b_i - A x_i ||_1) 2.852409e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.584297e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.852409e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.584297e-01 (SUCCESS) Start 3164: 
mpi_dst_example_simple_lap_d_facto0_sched4_kway_rqrrtend 2997/3626 Test #3260: mpi_dst_example_simple_lap_c_facto0_sched4_kway_rqrrtend ................***Timeout 569.43 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.189017e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.082304e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 
3.929302e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 9.769802e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.509737e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 5.074034e+00 s Time to initialize coeftab 1.388769e+00 s Time to factorize 9.218974e+01 s (225.27 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.066944e+01 s Time for refinement 3.454501e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.946820e-07 max(|| b_i - A x_i ||_1) 8.823629e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.226509e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.946820e-07 max(|| b_i - A x_i ||_1) 8.823629e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.226509e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.946820e-07 max(|| b_i - A x_i ||_1) 8.823629e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.226509e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.946820e-07 max(|| b_i - A x_i 
||_1) 8.823629e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.226509e+00 (SUCCESS) Start 3260: mpi_dst_example_simple_lap_c_facto0_sched4_kway_rqrrtend 2997/3626 Test #3263: mpi_dst_example_simple_lap_c_facto0_sched4_kway_pqrcpilu0 ...............***Timeout 570.08 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.738108e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.499429e-01 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.367506e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.672391e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.250000e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.438281e+00 s Time to initialize coeftab 6.509171e-01 s Time to factorize 2.920258e+01 s (711.16 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 6.202036e+00 s - iteration 1 : total iteration time 7.58 s error 2.9482e-11 Time for refinement 1.657188e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.531331e-08 max(|| b_i - A x_i ||_1) 3.308354e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.348128e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.531331e-08 max(|| b_i - A x_i ||_1) 3.308354e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.348128e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.531331e-08 
max(|| b_i - A x_i ||_1) 3.308354e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.348128e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.531331e-08 max(|| b_i - A x_i ||_1) 3.308354e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.348128e-01 (SUCCESS) Start 3263: mpi_dst_example_simple_lap_c_facto0_sched4_kway_pqrcpilu0 2997/3626 Test #3298: mpi_dst_example_simple_lap_c_facto2_sched4_not_svdend ...................***Timeout 585.44 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.057972e+01 s +-------------------------------------------------+ Symbolic 
factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.889062e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.025463e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 6.388950e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.988659e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.399450e+00 s Time to initialize coeftab 1.036336e+00 s Time to factorize 6.044771e+01 s (677.10 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 7.249864e+00 s Time for refinement 1.505603e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.753400e-07 max(|| b_i - A x_i ||_1) 7.598307e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.917317e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.753400e-07 max(|| b_i - A x_i ||_1) 7.598307e-08 max(|| b_i - A x_i ||_1 / 
(||A||_1 * ||x_i||_oo * eps)) 1.917317e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.753400e-07 max(|| b_i - A x_i ||_1) 7.598307e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.917317e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.753400e-07 max(|| b_i - A x_i ||_1) 7.598307e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.917317e+00 (SUCCESS) Start 3298: mpi_dst_example_simple_lap_c_facto2_sched4_not_svdend 2997/3626 Test #3304: mpi_dst_example_simple_lap_c_facto2_sched4_not_pqrcpend .................***Timeout 587.58 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch 
Time to compute ordering 2.667696e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.885198e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.618917e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 4.033536e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.229708e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.060293e+00 s Time to initialize coeftab 5.789794e-01 s Time to factorize 2.886944e+01 s ( 1.38 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 3.652744e+00 s - iteration 1 : total iteration time 13.7 s error 2.628e-12 Time for refinement 1.935889e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.275081e-08 max(|| b_i - A x_i ||_1) 3.123402e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * 
||x_i||_oo * eps)) 7.881431e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.275081e-08 max(|| b_i - A x_i ||_1) 3.123402e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.881431e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.275081e-08 max(|| b_i - A x_i ||_1) 3.123402e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.881431e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.275081e-08 max(|| b_i - A x_i ||_1) 3.123402e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.881431e-01 (SUCCESS) Start 3304: mpi_dst_example_simple_lap_c_facto2_sched4_not_pqrcpend 2997/3626 Test #3306: mpi_dst_example_simple_lap_c_facto2_sched4_kway_pqrcpend ................***Timeout 587.68 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 
Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.233295e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.320059e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.189899e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.464940e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.439632e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.026919e-01 s Time to initialize coeftab 2.130025e-01 s Time to factorize 2.180648e+01 s ( 1.83 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 6.466941e+00 s - iteration 1 : total iteration time 14.2 s error 8.7639e-13 Time for refinement 2.065091e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 
max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.230988e-08 max(|| b_i - A x_i ||_1) 3.088579e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.793559e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.230988e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.230988e-08 max(|| b_i - A x_i ||_1) 3.088579e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.793559e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 3.088579e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.793559e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.230988e-08 max(|| b_i - A x_i ||_1) 3.088579e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.793559e-01 (SUCCESS) Start 3306: mpi_dst_example_simple_lap_c_facto2_sched4_kway_pqrcpend 2997/3626 Test #3307: mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_pqrcpbegin ...***Timeout 587.69 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number 
has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch 1: 300 1140 2: 200 760 3: 200 660 Time to compute ordering 1.373412e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.939738e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.672947e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.968078e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.594364e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 7.269767e-01 s Time to initialize coeftab 1.793046e+00 s Time to factorize 4.194303e+01 s (975.83 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 5.895002e+00 s - iteration 1 : total iteration time 14.7 s error 5.4229e-11 Time for refinement 3.284958e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 
2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.349013e-08 max(|| b_i - A x_i ||_1) 3.170071e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.999192e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.349013e-08 max(|| b_i - A x_i ||_1) 3.170071e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.999192e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.349013e-08 max(|| b_i - A x_i ||_1) 3.170071e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.999192e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.349013e-08 max(|| b_i - A x_i ||_1) 3.170071e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.999192e-01 (SUCCESS) Start 3307: mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_pqrcpbegin 2997/3626 Test #3308: mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_pqrcpend .....***Timeout 588.24 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress 
minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.299643e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.445528e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.894488e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 9.995522e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.800244e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.226045e-01 s Time to initialize coeftab 1.880537e+00 s Time to factorize 1.534831e+01 s ( 2.60 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 8.364558e+00 s - iteration 1 : total iteration time 
4.99 s error 1.2103e-12 Time for refinement 1.167148e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.248042e-08 max(|| b_i - A x_i ||_1) 3.067113e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.739394e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.248042e-08 max(|| b_i - A x_i ||_1) 3.067113e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.739394e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.248042e-08 max(|| b_i - A x_i ||_1) 3.067113e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.739394e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.248042e-08 max(|| b_i - A x_i ||_1) 3.067113e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.739394e-01 (SUCCESS) Start 3308: mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_pqrcpend 2997/3626 Test #3309: mpi_dst_example_simple_lap_c_facto2_sched4_not_rqrcpbegin ...............***Timeout 588.25 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: 
Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.302949e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.340152e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.758960e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 7.974830e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.277293e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.440357e-01 s Time to initialize coeftab 1.859064e+00 s Time to factorize 4.296470e+01 s (952.63 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside 
selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 6.381077e+00 s - iteration 1 : total iteration time 30.1 s error 6.2768e-12 Time for refinement 4.463611e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.412974e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.412974e-08 max(|| b_i - A x_i ||_1) 3.201125e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.077552e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 3.201125e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.077552e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.412974e-08 max(|| b_i - A x_i ||_1) 3.201125e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.077552e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.412974e-08 max(|| b_i - A x_i ||_1) 3.201125e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.077552e-01 (SUCCESS) Start 3309: mpi_dst_example_simple_lap_c_facto2_sched4_not_rqrcpbegin 2997/3626 Test #3310: mpi_dst_example_simple_lap_c_facto2_sched4_not_rqrcpend .................***Timeout 588.24 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI 
communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.135858e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.475376e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.775679e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.771039e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.393919e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.047990e-02 s Time to initialize coeftab 1.245371e-01 s Time to factorize 1.884825e+01 s ( 2.12 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: 
------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 4.094079e+00 s - iteration 1 : total iteration time 10.2 s error 3.6852e-12 Time for refinement 1.851492e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.263058e-08 max(|| b_i - A x_i ||_1) 3.093207e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.805239e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.263058e-08 max(|| b_i - A x_i ||_1) 3.093207e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.805239e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.263058e-08 max(|| b_i - A x_i ||_1) 3.093207e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.805239e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.263058e-08 max(|| b_i - A x_i ||_1) 3.093207e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.805239e-01 (SUCCESS) Start 3310: mpi_dst_example_simple_lap_c_facto2_sched4_not_rqrcpend 2997/3626 Test #3311: mpi_dst_example_simple_lap_c_facto2_sched4_kway_rqrcpbegin ..............***Timeout 588.23 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled 
StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.620302e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.513013e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.330687e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.725734e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.004643e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.116286e+00 s Time to 
initialize coeftab 4.008980e+00 s Time to factorize 7.324205e+01 s (558.82 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 9.164378e+00 s - iteration 1 : total iteration time 5.53 s error 5.2829e-11 Time for refinement 9.801904e+00 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.602102e-08 max(|| b_i - A x_i ||_1) 3.274272e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.262127e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.602102e-08 max(|| b_i - A x_i ||_1) 3.274272e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.262127e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.602102e-08 max(|| b_i - A x_i ||_1) 3.274272e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.262127e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.602102e-08 max(|| b_i - A x_i ||_1) 3.274272e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.262127e-01 (SUCCESS) Start 3311: mpi_dst_example_simple_lap_c_facto2_sched4_kway_rqrcpbegin 2997/3626 Test #3312: mpi_dst_example_simple_lap_c_facto2_sched4_kway_rqrcpend ................***Timeout 588.24 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.639074e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.725083e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.048071e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 5.301156e-01 s +-------------------------------------------------+ Analyze 
task: Total time for analyze 3.405997e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.645424e-01 s Time to initialize coeftab 4.649478e-02 s Time to factorize 3.349049e+00 s (11.93 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 2.735521e+00 s - iteration 1 : total iteration time 1.08 s error 1.0536e-12 Time for refinement 2.116300e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.224887e-08 max(|| b_i - A x_i ||_1) 3.092714e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.803993e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.224887e-08 max(|| b_i - A x_i ||_1) 3.092714e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.803993e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.224887e-08 max(|| b_i - A x_i ||_1) 3.092714e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.803993e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.224887e-08 max(|| b_i - A x_i ||_1) 3.092714e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.803993e-01 (SUCCESS) Start 3312: mpi_dst_example_simple_lap_c_facto2_sched4_kway_rqrcpend 2997/3626 Test #3313: mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_rqrcpbegin ...***Timeout 588.24 sec ischedInit: The thread number has 
been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.200651e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.291173e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.356951e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 
MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 8.263037e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.200183e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.273793e-01 s Time to initialize coeftab 1.329122e+00 s Time to factorize 3.793011e+01 s ( 1.05 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 4.264779e+00 s - iteration 1 : total iteration time 19 s error 5.9981e-11 Time for refinement 2.269688e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.653448e-08 max(|| b_i - A x_i ||_1) 3.262976e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.233625e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.653448e-08 max(|| b_i - A x_i ||_1) 3.262976e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.233625e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.653448e-08 max(|| b_i - A x_i ||_1) 3.262976e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.233625e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.653448e-08 max(|| b_i - A x_i ||_1) 3.262976e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.233625e-01 (SUCCESS) Start 3313: 
mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_rqrcpbegin 2997/3626 Test #3314: mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_rqrcpend .....***Timeout 588.24 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.490351e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.395849e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 
Time for reordering 7.071111e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 5.647303e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.276127e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.110087e-01 s Time to initialize coeftab 1.224986e-01 s Time to factorize 9.574781e+00 s ( 4.17 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 1.615053e+00 s - iteration 1 : total iteration time 9.16 s error 1.0524e-12 Time for refinement 1.867460e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.159996e-08 max(|| b_i - A x_i ||_1) 3.060824e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.723525e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.159996e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.159996e-08 max(|| b_i - A x_i ||_1) 3.060824e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.723525e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.159996e-08 max(|| b_i - A x_i ||_1) 3.060824e-08 max(|| b_i - A x_i ||_1 / 
(||A||_1 * ||x_i||_oo * eps)) 7.723525e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 3.060824e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.723525e-01 (SUCCESS) Start 3314: mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_rqrcpend 2997/3626 Test #3315: mpi_dst_example_simple_lap_c_facto2_sched4_not_tqrcpbegin ...............***Timeout 588.57 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.392898e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 
Fill-in of L 17.592432 Time to compute symbol matrix 2.099969e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.044285e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.458089e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.145275e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.000442e+00 s Time to initialize coeftab 3.952012e+00 s Time to factorize 9.306675e+01 s (439.78 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 8.621613e+00 s - iteration 1 : total iteration time 8.23 s error 1.1268e-11 Time for refinement 1.581059e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.425153e-08 max(|| b_i - A x_i ||_1) 3.195701e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.063865e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.425153e-08 max(|| b_i - A x_i ||_1) 3.195701e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.063865e-01 
(SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.425153e-08 max(|| b_i - A x_i ||_1) 3.195701e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.063865e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.425153e-08 max(|| b_i - A x_i ||_1) 3.195701e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.063865e-01 (SUCCESS) Start 3315: mpi_dst_example_simple_lap_c_facto2_sched4_not_tqrcpbegin 2997/3626 Test #3316: mpi_dst_example_simple_lap_c_facto2_sched4_not_tqrcpend .................***Timeout 588.56 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.709765e+01 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.005806e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.072883e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 4.014229e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.184156e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.822085e+00 s Time to initialize coeftab 1.467158e+00 s Time to factorize 5.050839e+01 s (810.35 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 3.478895e+00 s - iteration 1 : total iteration time 18.2 s error 1.0757e-12 Time for refinement 3.082759e+01 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.161492e-08 max(|| b_i - A x_i ||_1) 3.085081e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.784733e-01 
(SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.161492e-08 max(|| b_i - A x_i ||_1) 3.085081e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.784733e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.161492e-08 max(|| b_i - A x_i ||_1) 3.085081e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.784733e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.161492e-08 max(|| b_i - A x_i ||_1) 3.085081e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.784733e-01 (SUCCESS) Start 3316: mpi_dst_example_simple_lap_c_facto2_sched4_not_tqrcpend 2997/3626 Test #3317: mpi_dst_example_simple_lap_c_facto2_sched4_kway_tqrcpbegin ..............***Timeout 588.48 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 
1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.421413e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.104850e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.194126e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.951080e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.711821e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.648532e-01 s Time to initialize coeftab 1.426737e+00 s Time to factorize 8.617316e+01 s (474.97 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 4.122017e+00 s - iteration 1 : total iteration time 5.63 s error 5.345e-11 Time for refinement 9.509522e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 
max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.368128e-08 max(|| b_i - A x_i ||_1) 3.219894e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.124914e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.368128e-08 max(|| b_i - A x_i ||_1) 3.219894e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.124914e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.368128e-08 max(|| b_i - A x_i ||_1) 3.219894e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.124914e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.368128e-08 max(|| b_i - A x_i ||_1) 3.219894e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.124914e-01 (SUCCESS) Start 3317: mpi_dst_example_simple_lap_c_facto2_sched4_kway_tqrcpbegin 2997/3626 Test #3318: mpi_dst_example_simple_lap_c_facto2_sched4_kway_tqrcpend ................***Timeout 588.48 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The 
thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.914655e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.866191e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.660767e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.643461e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.319611e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.663539e+00 s Time to initialize coeftab 9.130721e-01 s Time to factorize 1.077518e+02 s (379.85 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko Start 3318: mpi_dst_example_simple_lap_c_facto2_sched4_kway_tqrcpend 2997/3626 Test #3319: mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_tqrcpbegin ...***Timeout 588.47 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.834051e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.770785e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.005793e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.394896e+00 s 
+-------------------------------------------------+ Analyze task: Total time for analyze 6.743518e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.599574e-01 s Time to initialize coeftab 8.846921e-01 s Time to factorize 6.023701e+01 s (679.47 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 9.162950e+00 s - iteration 1 : total iteration time 14.4 s error 6.7292e-11 Time for refinement 2.755901e+01 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.447465e-08 max(|| b_i - A x_i ||_1) 3.146706e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.940235e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.447465e-08 max(|| b_i - A x_i ||_1) 3.146706e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.940235e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.447465e-08 max(|| b_i - A x_i ||_1) 3.146706e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.940235e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.447465e-08 max(|| b_i - A x_i ||_1) 3.146706e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.940235e-01 (SUCCESS) Start 3319: mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_tqrcpbegin 2997/3626 Test #3320: 
mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_tqrcpend .....***Timeout 588.98 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.494036e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.073628e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.625209e-02 s +-------------------------------------------------+ 
Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 4.581493e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.998756e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.518447e+00 s Time to initialize coeftab 3.336635e-01 s Time to factorize 4.406652e+01 s (928.81 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 1.234248e+01 s - iteration 1 : total iteration time 10.9 s error 3.0305e-12 Time for refinement 1.603814e+01 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.167414e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.167414e-08 max(|| b_i - A x_i ||_1) 3.082472e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.778149e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 3.082472e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.778149e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.167414e-08 max(|| b_i - A x_i ||_1) 3.082472e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.778149e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.167414e-08 max(|| b_i - A x_i ||_1) 
3.082472e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.778149e-01 (SUCCESS) Start 3320: mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_tqrcpend 2997/3626 Test #3321: mpi_dst_example_simple_lap_c_facto2_sched4_not_rqrrtbegin ...............***Timeout 588.99 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.398246e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.502517e-03 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.304747e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.972411e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.710734e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 9.997370e-01 s Time to initialize coeftab 4.435938e+00 s Time to factorize 5.386738e+01 s (759.82 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 3.985251e+00 s - iteration 1 : total iteration time 13.7 s error 5.5757e-11 Time for refinement 1.983399e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.411967e-08 max(|| b_i - A x_i ||_1) 3.226982e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.142798e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.411967e-08 max(|| b_i - A x_i ||_1) 3.226982e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.142798e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.411967e-08 max(|| 
b_i - A x_i ||_1) 3.226982e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.142798e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.411967e-08 max(|| b_i - A x_i ||_1) 3.226982e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.142798e-01 (SUCCESS) Start 3321: mpi_dst_example_simple_lap_c_facto2_sched4_not_rqrrtbegin 2997/3626 Test #3322: mpi_dst_example_simple_lap_c_facto2_sched4_not_rqrrtend .................***Timeout 588.98 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.254463e+00 s +-------------------------------------------------+ Symbolic 
factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.612043e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.627551e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.617387e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.117994e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.233404e+00 s Time to initialize coeftab 2.422838e-01 s Time to factorize 2.335559e+01 s ( 1.71 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 4.535327e+00 s - iteration 1 : total iteration time 9.46 s error 5.1244e-13 Time for refinement 2.107154e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.107535e-08 max(|| b_i - A x_i ||_1) 3.044525e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.682397e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.107535e-08 max(|| 
b_i - A x_i ||_1) 3.044525e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.682397e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.107535e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.107535e-08 max(|| b_i - A x_i ||_1) 3.044525e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.682397e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 3.044525e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.682397e-01 (SUCCESS) Start 3322: mpi_dst_example_simple_lap_c_facto2_sched4_not_rqrrtend 2997/3626 Test #3323: mpi_dst_example_simple_lap_c_facto2_sched4_kway_rqrrtbegin ..............***Timeout 589.57 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 
+-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.415653e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.910847e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.195046e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 8.401558e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.510532e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.058219e-02 s Time to initialize coeftab 2.101059e+00 s Time to factorize 5.345471e+01 s (765.68 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 1.133391e+01 s - iteration 1 : total iteration time 13.2 s error 1.1264e-11 Time for refinement 2.641223e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || 
b_i ||_2) 6.471037e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.471037e-08 max(|| b_i - A x_i ||_1) 3.194134e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.059910e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.471037e-08 max(|| b_i - A x_i ||_1) 3.194134e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.059910e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 3.194134e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.059910e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.471037e-08 max(|| b_i - A x_i ||_1) 3.194134e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.059910e-01 (SUCCESS) Start 3323: mpi_dst_example_simple_lap_c_facto2_sched4_kway_rqrrtbegin 2997/3626 Test #3324: mpi_dst_example_simple_lap_c_facto2_sched4_kway_rqrrtend ................***Timeout 589.49 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been 
automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.037021e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.173883e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.742498e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.590453e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.204285e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.893855e-01 s Time to initialize coeftab 6.819953e-01 s Time to factorize 1.240690e+01 s ( 3.22 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 9.531968e+00 s - iteration 1 : total iteration time 4.57 s error 8.9208e-13 Time for refinement 8.575999e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i 
||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.197109e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.197109e-08 max(|| b_i - A x_i ||_1) 3.071052e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.749334e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 3.071052e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.749334e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.197109e-08 max(|| b_i - A x_i ||_1) 3.071052e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.749334e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.197109e-08 max(|| b_i - A x_i ||_1) 3.071052e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.749334e-01 (SUCCESS) Start 3324: mpi_dst_example_simple_lap_c_facto2_sched4_kway_rqrrtend 2997/3626 Test #3327: mpi_dst_example_simple_lap_c_facto2_sched4_kway_pqrcpilu0 ...............***Timeout 602.82 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion 
per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.107662e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.074959e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.649854e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.669052e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.309207e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 4.480832e-01 s Time to initialize coeftab 3.752258e-01 s Time to factorize 3.442316e+01 s ( 1.16 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 3.708250e+00 s - iteration 1 : total iteration time 11.5 s error 1.3521e-11 Time for refinement 2.227360e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 
2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.516193e-08 max(|| b_i - A x_i ||_1) 3.216101e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.115342e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.516193e-08 max(|| b_i - A x_i ||_1) 3.216101e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.115342e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.516193e-08 max(|| b_i - A x_i ||_1) 3.216101e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.115342e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.516193e-08 max(|| b_i - A x_i ||_1) 3.216101e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.115342e-01 (SUCCESS) Start 3327: mpi_dst_example_simple_lap_c_facto2_sched4_kway_pqrcpilu0 2997/3626 Test #3328: mpi_dst_example_simple_lap_c_facto2_sched4_kway_pqrcpilu1 ...............***Timeout 602.81 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: 
Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.545513e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.352301e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.334379e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.032119e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.844403e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.508720e-01 s Time to initialize coeftab 7.826340e-02 s Time to factorize 9.375985e+00 s ( 4.26 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to 
solve 7.650948e+00 s - iteration 1 : total iteration time 3.24 s error 7.0148e-12 Time for refinement 5.575346e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.603448e-08 max(|| b_i - A x_i ||_1) 3.219306e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.123429e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.603448e-08 max(|| b_i - A x_i ||_1) 3.219306e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.123429e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.603448e-08 max(|| b_i - A x_i ||_1) 3.219306e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.123429e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.603448e-08 max(|| b_i - A x_i ||_1) 3.219306e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.123429e-01 (SUCCESS) Start 3328: mpi_dst_example_simple_lap_c_facto2_sched4_kway_pqrcpilu1 2997/3626 Test #3330: mpi_dst_example_simple_lap_c_facto3_sched4_not_svdend ...................***Timeout 602.86 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution 
level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.044145e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.996731e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.280181e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.607340e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.270745e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 5.358548e-01 s Time to initialize coeftab 1.553941e-01 s Time to factorize 5.353110e+01 s (387.95 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 
48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.898282e+01 s Time for refinement 6.018858e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.920935e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.920935e-07 max(|| b_i - A x_i ||_1) 8.736381e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.204493e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.920935e-07 max(|| b_i - A x_i ||_1) 8.736381e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.204493e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.920935e-07 max(|| b_i - A x_i ||_1) 8.736381e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.204493e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.736381e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.204493e+00 (SUCCESS) Start 3330: mpi_dst_example_simple_lap_c_facto3_sched4_not_svdend 2997/3626 Test #3334: mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_svdend .......***Timeout 603.96 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: 
PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.131508e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.296351e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.141655e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.586834e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.513410e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 1.050253e+00 s Time to initialize coeftab 1.113235e+00 s Time to factorize 7.542149e+01 s (275.35 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: 
------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 8.134390e+00 s Time for refinement 2.229367e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.930983e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.930983e-07 max(|| b_i - A x_i ||_1) 8.654089e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.183728e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.654089e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.183728e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.930983e-07 max(|| b_i - A x_i ||_1) 8.654089e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.183728e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.930983e-07 max(|| b_i - A x_i ||_1) 8.654089e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.183728e+00 (SUCCESS) Start 3334: mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_svdend 2997/3626 Test #3335: mpi_dst_example_simple_lap_c_facto3_sched4_not_pqrcpbegin ...............***Timeout 603.63 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number 
of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.291387e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.498646e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.066664e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 5.696865e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.956928e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 9.509695e-03 s Time to initialize coeftab 2.066232e-01 s Time to 
factorize 1.868747e+01 s ( 1.09 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.419756e+00 s - iteration 1 : total iteration time 11.7 s error 5.5768e-11 Time for refinement 1.562129e+01 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.673748e-08 max(|| b_i - A x_i ||_1) 3.295075e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.314621e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.673748e-08 max(|| b_i - A x_i ||_1) 3.295075e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.314621e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.673748e-08 max(|| b_i - A x_i ||_1) 3.295075e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.314621e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.673748e-08 max(|| b_i - A x_i ||_1) 3.295075e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.314621e-01 (SUCCESS) Start 3335: mpi_dst_example_simple_lap_c_facto3_sched4_not_pqrcpbegin 2997/3626 Test #3336: mpi_dst_example_simple_lap_c_facto3_sched4_not_pqrcpend .................***Timeout 603.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel 
Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.243344e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.425394e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.567848e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.607411e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.889090e+01 s 
+-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 5.378764e+00 s Time to initialize coeftab 1.472649e+00 s Time to factorize 2.816352e+01 s (737.39 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.167296e+01 s Time for refinement 2.080007e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.139801e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.139801e-07 max(|| b_i - A x_i ||_1) 1.168694e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.949020e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.168694e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.949020e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.139801e-07 max(|| b_i - A x_i ||_1) 1.168694e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.949020e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.139801e-07 max(|| b_i - A x_i ||_1) 1.168694e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.949020e+00 (SUCCESS) Start 3336: mpi_dst_example_simple_lap_c_facto3_sched4_not_pqrcpend 2997/3626 Test #3337: mpi_dst_example_simple_lap_c_facto3_sched4_kway_pqrcpbegin ..............***Timeout 603.78 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 2: 200 760 1: 300 1140 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.182860e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.274342e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.140996e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 
1.705362e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.125331e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 3.071773e-01 s Time to initialize coeftab 3.157081e-01 s Time to factorize 2.434002e+01 s (853.23 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.138415e+01 s - iteration 1 : total iteration time 4.81 s error 6.7024e-11 Time for refinement 8.449684e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.434505e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.434505e-08 max(|| b_i - A x_i ||_1) 3.248681e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.197552e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.434505e-08 max(|| b_i - A x_i ||_1) 3.248681e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.197552e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 3.248681e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.197552e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.434505e-08 max(|| b_i - A x_i ||_1) 3.248681e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.197552e-01 (SUCCESS) Start 3337: mpi_dst_example_simple_lap_c_facto3_sched4_kway_pqrcpbegin 2997/3626 Test #3338: 
mpi_dst_example_simple_lap_c_facto3_sched4_kway_pqrcpend ................***Timeout 603.74 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.819436e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.338685e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.709871e+00 s +-------------------------------------------------+ Mapping/Scheduling 
subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.599696e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.450372e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 5.719657e-01 s Time to initialize coeftab 6.868283e-01 s Time to factorize 2.193022e+01 s (946.99 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.071176e+01 s Time for refinement 1.198547e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.037814e-07 max(|| b_i - A x_i ||_1) 1.171809e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.956881e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.037814e-07 max(|| b_i - A x_i ||_1) 1.171809e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.956881e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.037814e-07 max(|| b_i - A x_i ||_1) 1.171809e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.956881e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.037814e-07 max(|| b_i - A x_i ||_1) 1.171809e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.956881e+00 
(SUCCESS) Start 3338: mpi_dst_example_simple_lap_c_facto3_sched4_kway_pqrcpend 2997/3626 Test #3339: mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_pqrcpbegin ...***Timeout 603.09 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.956970e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.135018e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping 
criterion -1 Time for reordering 1.395841e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.264179e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.924494e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 7.819905e-01 s Time to initialize coeftab 7.189989e-01 s Time to factorize 3.975439e+01 s (522.40 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 7.373096e+00 s - iteration 1 : total iteration time 11.8 s error 2.2422e-11 Time for refinement 2.530666e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.379721e-08 max(|| b_i - A x_i ||_1) 3.235671e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.164724e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.379721e-08 max(|| b_i - A x_i ||_1) 3.235671e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.164724e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.379721e-08 max(|| b_i - A x_i ||_1) 3.235671e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 
8.164724e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.379721e-08 max(|| b_i - A x_i ||_1) 3.235671e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.164724e-01 (SUCCESS) Start 3339: mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_pqrcpbegin 2997/3626 Test #3340: mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_pqrcpend .....***Timeout 602.91 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.232628e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of 
nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.122285e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.526774e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.679758e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.004800e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 1.647638e+00 s Time to initialize coeftab 4.006891e-01 s Time to factorize 1.376463e+01 s ( 1.47 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 4.991987e+00 s Time for refinement 2.205231e+00 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.707335e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.707335e-07 max(|| b_i - A x_i ||_1) 1.376446e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.473253e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.376446e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.473253e+00 (SUCCESS) max(|| b_i - A x_i 
||_2 / || b_i ||_2) 9.707335e-07 max(|| b_i - A x_i ||_1) 1.376446e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.473253e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.707335e-07 max(|| b_i - A x_i ||_1) 1.376446e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.473253e+00 (SUCCESS) Start 3340: mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_pqrcpend 2997/3626 Test #3341: mpi_dst_example_simple_lap_c_facto3_sched4_not_rqrcpbegin ...............***Timeout 602.89 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.031734e+01 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.795708e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.429110e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.049416e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.303598e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 5.752558e-01 s Time to initialize coeftab 1.934677e+00 s Time to factorize 1.084258e+02 s (191.54 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 6.373160e+00 s - iteration 1 : total iteration time 11.5 s error 2.6952e-11 Time for refinement 2.002413e+01 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.486580e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.486580e-08 max(|| b_i - A x_i ||_1) 3.279640e-08 max(|| b_i - A x_i 
||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.275674e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 3.279640e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.275674e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.486580e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.486580e-08 max(|| b_i - A x_i ||_1) 3.279640e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.275674e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 3.279640e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.275674e-01 (SUCCESS) Start 3341: mpi_dst_example_simple_lap_c_facto3_sched4_not_rqrcpbegin 2997/3626 Test #3342: mpi_dst_example_simple_lap_c_facto3_sched4_not_rqrcpend .................***Timeout 602.89 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 
1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.058337e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.291168e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.400160e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 7.007459e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.876269e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 2.161592e-01 s Time to initialize coeftab 1.516225e+00 s Time to factorize 1.258619e+01 s ( 1.61 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.587267e+00 s Time for refinement 7.409603e-01 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.175336e-07 max(|| 
b_i - A x_i ||_1) 1.184864e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.989823e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.175336e-07 max(|| b_i - A x_i ||_1) 1.184864e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.989823e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.175336e-07 max(|| b_i - A x_i ||_1) 1.184864e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.989823e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.175336e-07 max(|| b_i - A x_i ||_1) 1.184864e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.989823e+00 (SUCCESS) Start 3342: mpi_dst_example_simple_lap_c_facto3_sched4_not_rqrcpend 2997/3626 Test #3344: mpi_dst_example_simple_lap_c_facto3_sched4_kway_rqrcpend ................***Timeout 603.93 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: 
Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.473845e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.579960e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.895282e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.670580e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.557461e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 2.794025e-02 s Time to initialize coeftab 2.168858e-01 s Time to factorize 2.482833e+01 s (836.45 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 4.208571e+00 s Time for refinement 4.138892e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 
2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.209769e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.209769e-07 max(|| b_i - A x_i ||_1) 1.193507e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.011632e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.193507e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.011632e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.209769e-07 max(|| b_i - A x_i ||_1) 1.193507e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.011632e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.209769e-07 max(|| b_i - A x_i ||_1) 1.193507e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.011632e+00 (SUCCESS) Start 3344: mpi_dst_example_simple_lap_c_facto3_sched4_kway_rqrcpend 2997/3626 Test #3346: mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_rqrcpend .....***Timeout 613.29 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The 
thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.980806e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.348805e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.387194e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.261370e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.150009e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 1.229338e+00 s Time to initialize coeftab 7.670704e-01 s Time to factorize 5.082676e+01 s (408.60 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 2.322742e+01 s Time for refinement 3.088371e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A 
||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.144237e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.144237e-07 max(|| b_i - A x_i ||_1) 1.185927e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.992506e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.144237e-07 max(|| b_i - A x_i ||_1) 1.185927e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.992506e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.144237e-07 max(|| b_i - A x_i ||_1) 1.185927e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.992506e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.185927e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.992506e+00 (SUCCESS) Start 3346: mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_rqrcpend 2997/3626 Test #3347: mpi_dst_example_simple_lap_c_facto3_sched4_not_tqrcpbegin ...............***Timeout 612.95 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of 
projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.464282e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.522041e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.686195e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.937028e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.704126e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 4.288151e-01 s Time to initialize coeftab 1.475349e+00 s Time to factorize 6.270262e+01 s (331.21 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 2.114304e+01 s - iteration 1 : total iteration time 5.08 s error 5.4004e-11 Time for refinement 
9.663140e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.616455e-08 max(|| b_i - A x_i ||_1) 3.317739e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.371809e-01 (SUCCESS) || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.616455e-08 max(|| b_i - A x_i ||_1) 3.317739e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.371809e-01 (SUCCESS) || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.616455e-08 max(|| b_i - A x_i ||_1) 3.317739e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.371809e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.616455e-08 max(|| b_i - A x_i ||_1) 3.317739e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.371809e-01 (SUCCESS) Start 3347: mpi_dst_example_simple_lap_c_facto3_sched4_not_tqrcpbegin 2997/3626 Test #3348: mpi_dst_example_simple_lap_c_facto3_sched4_not_tqrcpend .................***Timeout 612.60 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: 
Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.425980e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.396579e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.786919e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.188164e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.960883e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 1.920360e+00 s Time to initialize coeftab 1.590982e+00 s Time to factorize 4.706147e+01 s (441.29 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko 
------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 6.013147e+00 s - iteration 1 : total iteration time 11 s error 8.5211e-12 Time for refinement 1.963467e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.227302e-08 max(|| b_i - A x_i ||_1) 3.158403e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.969748e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.227302e-08 max(|| b_i - A x_i ||_1) 3.158403e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.969748e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.227302e-08 max(|| b_i - A x_i ||_1) 3.158403e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.969748e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.227302e-08 max(|| b_i - A x_i ||_1) 3.158403e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.969748e-01 (SUCCESS) Start 3348: mpi_dst_example_simple_lap_c_facto3_sched4_not_tqrcpend 2997/3626 Test #3350: mpi_dst_example_simple_lap_c_facto3_sched4_kway_tqrcpend ................***Timeout 612.83 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 
Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.705044e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.001138e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.296001e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.987000e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.255854e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 1.044444e+00 s Time to initialize coeftab 6.501179e-01 s Time to factorize 2.457933e+01 s (844.92 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ 
Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 6.388603e+00 s Time for refinement 6.543857e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.147443e-07 max(|| b_i - A x_i ||_1) 1.167083e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.944957e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.147443e-07 max(|| b_i - A x_i ||_1) 1.167083e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.944957e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.147443e-07 max(|| b_i - A x_i ||_1) 1.167083e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.944957e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.147443e-07 max(|| b_i - A x_i ||_1) 1.167083e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.944957e+00 (SUCCESS) Start 3350: mpi_dst_example_simple_lap_c_facto3_sched4_kway_tqrcpend 2997/3626 Test #3352: mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_tqrcpend .....***Timeout 613.07 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: 
Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.433401e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.966284e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.914821e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.154834e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.558697e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 2.089550e-02 s Time to initialize coeftab 1.369899e-01 s Time to factorize 2.293546e+01 s (905.48 KFlop/s) 
Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 3.187662e+00 s - iteration 1 : total iteration time 4.55 s error 2.7171e-11 Time for refinement 9.391116e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.197693e-08 max(|| b_i - A x_i ||_1) 3.179604e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.023248e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.197693e-08 max(|| b_i - A x_i ||_1) 3.179604e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.023248e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.197693e-08 max(|| b_i - A x_i ||_1) 3.179604e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.023248e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.197693e-08 max(|| b_i - A x_i ||_1) 3.179604e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.023248e-01 (SUCCESS) Start 3352: mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_tqrcpend 2997/3626 Test #3353: mpi_dst_example_simple_lap_c_facto3_sched4_not_rqrrtbegin ...............***Timeout 612.91 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.613612e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.183476e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.997058e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.403032e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.065562e+01 s 
+-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 4.591517e-01 s Time to initialize coeftab 2.126903e+00 s Time to factorize 6.820692e+01 s (304.48 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.571205e+01 s - iteration 1 : total iteration time 5.2 s error 6.0859e-11 Time for refinement 9.099106e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.670695e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.670695e-08 max(|| b_i - A x_i ||_1) 3.350215e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.453759e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 3.350215e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.453759e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.670695e-08 max(|| b_i - A x_i ||_1) 3.350215e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.453759e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.670695e-08 max(|| b_i - A x_i ||_1) 3.350215e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.453759e-01 (SUCCESS) Start 3353: mpi_dst_example_simple_lap_c_facto3_sched4_not_rqrrtbegin 2997/3626 Test #3354: mpi_dst_example_simple_lap_c_facto3_sched4_not_rqrrtend .................***Timeout 612.90 sec ischedInit: The thread number has been automatically set to 256 ischedInit: 
The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.893149e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.745660e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.639815e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 
5.477282e-04 s Time for mapping/scheduling 6.929803e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.750799e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 6.746477e-02 s Time to initialize coeftab 3.897141e-01 s Time to factorize 9.708103e+01 s (213.92 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 2.505696e+01 s Time for refinement 4.480757e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.955151e-07 max(|| b_i - A x_i ||_1) 1.187651e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.996856e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.955151e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.955151e-07 max(|| b_i - A x_i ||_1) 1.187651e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.996856e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.955151e-07 max(|| b_i - A x_i ||_1) 1.187651e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.996856e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.187651e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.996856e+00 (SUCCESS) Start 3354: mpi_dst_example_simple_lap_c_facto3_sched4_not_rqrrtend 2997/3626 Test #3355: mpi_dst_example_simple_lap_c_facto3_sched4_kway_rqrrtbegin 
..............***Timeout 612.54 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.296962e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.849947e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.635641e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 
17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.409613e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.566285e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 8.339371e-01 s Time to initialize coeftab 2.032665e+00 s Time to factorize 1.491171e+02 s (139.27 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 2.494436e+01 s - iteration 1 : total iteration time 8.49 s error 2.8379e-11 Time for refinement 1.528022e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.503697e-08 max(|| b_i - A x_i ||_1) 3.267512e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.245070e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.503697e-08 max(|| b_i - A x_i ||_1) 3.267512e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.245070e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.503697e-08 max(|| b_i - A x_i ||_1) 3.267512e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.245070e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.503697e-08 max(|| b_i - A x_i ||_1) 3.267512e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 
8.245070e-01 (SUCCESS) Start 3355: mpi_dst_example_simple_lap_c_facto3_sched4_kway_rqrrtbegin 2997/3626 Test #3356: mpi_dst_example_simple_lap_c_facto3_sched4_kway_rqrrtend ................***Timeout 612.09 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.554965e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.512818e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping 
criterion -1 Time for reordering 8.789984e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.540114e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.143669e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 4.533888e-01 s Time to initialize coeftab 4.054020e-01 s Time to factorize 3.267904e+01 s (635.50 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 2.062428e+01 s Time for refinement 4.668223e+00 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.400274e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.400274e-07 max(|| b_i - A x_i ||_1) 1.336948e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.373585e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.400274e-07 max(|| b_i - A x_i ||_1) 1.336948e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.373585e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.400274e-07 max(|| b_i - A x_i ||_1) 1.336948e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.373585e+00 
(SUCCESS) max(|| b_i - A x_i ||_1) 1.336948e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.373585e+00 (SUCCESS) Start 3356: mpi_dst_example_simple_lap_c_facto3_sched4_kway_rqrrtend 2997/3626 Test #3357: mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_rqrrtbegin ...***Timeout 611.96 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.555620e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol 
matrix 2.856838e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.422051e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.442153e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.910948e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 2.327414e+00 s Time to initialize coeftab 3.895480e+00 s Time to factorize 5.117448e+01 s (405.82 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 4.939340e+00 s - iteration 1 : total iteration time 10.3 s error 5.538e-11 Time for refinement 1.719981e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.751281e-08 max(|| b_i - A x_i ||_1) 3.429291e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.653294e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.751281e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.751281e-08 max(|| b_i - A x_i ||_1) 3.429291e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 
8.653294e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 3.429291e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.653294e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.751281e-08 max(|| b_i - A x_i ||_1) 3.429291e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.653294e-01 (SUCCESS) Start 3357: mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_rqrrtbegin 2997/3626 Test #3358: mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_rqrrtend .....***Timeout 610.30 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.093454e+01 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.080792e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.036107e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 5.422940e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.693883e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 1.278368e+00 s Time to initialize coeftab 1.386159e+00 s Time to factorize 4.080191e+01 s (508.99 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 8.721341e+00 s - iteration 1 : total iteration time 6.33 s error 1.3945e-11 Time for refinement 1.024730e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.144771e-08 max(|| b_i - A x_i ||_1) 3.131570e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.902040e-01 
(SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.144771e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.144771e-08 max(|| b_i - A x_i ||_1) 3.131570e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.902040e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 3.131570e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.902040e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.144771e-08 max(|| b_i - A x_i ||_1) 3.131570e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.902040e-01 (SUCCESS) Start 3358: mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_rqrrtend 2997/3626 Test #3359: mpi_dst_example_simple_lap_c_facto3_sched4_kway_pqrcpilu0 ...............***Timeout 610.39 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 
1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.121341e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.725545e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.079732e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 9.175059e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.609829e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 1.146086e+00 s Time to initialize coeftab 1.467001e+00 s Time to factorize 3.016924e+01 s (688.37 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.517106e+01 s - iteration 1 : total iteration time 22.5 s error 4.3292e-11 Time for refinement 3.043763e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i 
||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.554000e-08 max(|| b_i - A x_i ||_1) 3.253967e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.210891e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.554000e-08 max(|| b_i - A x_i ||_1) 3.253967e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.210891e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.554000e-08 max(|| b_i - A x_i ||_1) 3.253967e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.210891e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.554000e-08 max(|| b_i - A x_i ||_1) 3.253967e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.210891e-01 (SUCCESS) Start 3359: mpi_dst_example_simple_lap_c_facto3_sched4_kway_pqrcpilu0 2997/3626 Test #3360: mpi_dst_example_simple_lap_c_facto3_sched4_kway_pqrcpilu1 ...............***Timeout 610.44 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 
2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.733522e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.740732e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.902904e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.021887e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.947251e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 5.654102e-01 s Time to initialize coeftab 7.191610e-02 s Time to factorize 8.985210e+01 s (231.13 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 6.989967e+00 s - iteration 1 : total iteration time 19 s error 1.0951e-11 Time for refinement 2.620007e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i 
||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.178649e-08 max(|| b_i - A x_i ||_1) 3.134795e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.910179e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.178649e-08 max(|| b_i - A x_i ||_1) 3.134795e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.910179e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.178649e-08 max(|| b_i - A x_i ||_1) 3.134795e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.910179e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.178649e-08 max(|| b_i - A x_i ||_1) 3.134795e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.910179e-01 (SUCCESS) Start 3360: mpi_dst_example_simple_lap_c_facto3_sched4_kway_pqrcpilu1 2997/3626 Test #3364: mpi_dst_example_simple_lap_c_facto4_sched4_kway_svdend ..................***Timeout 606.61 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 
16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.158133e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.936788e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.745730e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.764887e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.459018e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 4.036042e-02 s Time to initialize coeftab 1.118059e+00 s Time to factorize 9.469456e+01 s (230.42 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 5.659198e+00 s Time for refinement 6.213053e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 
2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.825816e-07 max(|| b_i - A x_i ||_1) 7.959248e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.008395e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.825816e-07 max(|| b_i - A x_i ||_1) 7.959248e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.008395e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.825816e-07 max(|| b_i - A x_i ||_1) 7.959248e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.008395e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.825816e-07 max(|| b_i - A x_i ||_1) 7.959248e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.008395e+00 (SUCCESS) Start 3364: mpi_dst_example_simple_lap_c_facto4_sched4_kway_svdend 2997/3626 Test #3366: mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_svdend .......***Timeout 606.67 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress 
minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.938549e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.616430e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.076398e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 8.599900e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.133449e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 6.520505e-02 s Time to initialize coeftab 2.420826e-01 s Time to factorize 1.967083e+01 s ( 1.08 MFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 
137 Ko Time to solve 5.794891e+00 s Time for refinement 5.114313e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.865082e-07 max(|| b_i - A x_i ||_1) 8.083524e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.039754e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.865082e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.865082e-07 max(|| b_i - A x_i ||_1) 8.083524e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.039754e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.083524e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.039754e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.865082e-07 max(|| b_i - A x_i ||_1) 8.083524e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.039754e+00 (SUCCESS) Start 3366: mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_svdend 2997/3626 Test #3368: mpi_dst_example_simple_lap_c_facto4_sched4_not_pqrcpend .................***Timeout 606.62 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL 
GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.352850e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.787251e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.015747e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.002341e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.737598e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 8.600753e-01 s Time to initialize coeftab 6.169863e-01 s Time to factorize 5.590450e+01 s (390.30 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o 
Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 7.077456e+00 s Time for refinement 4.182733e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.844734e-07 max(|| b_i - A x_i ||_1) 7.978782e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.013324e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.844734e-07 max(|| b_i - A x_i ||_1) 7.978782e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.013324e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.844734e-07 max(|| b_i - A x_i ||_1) 7.978782e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.013324e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.844734e-07 max(|| b_i - A x_i ||_1) 7.978782e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.013324e+00 (SUCCESS) Start 3368: mpi_dst_example_simple_lap_c_facto4_sched4_not_pqrcpend 2997/3626 Test #3370: mpi_dst_example_simple_lap_c_facto4_sched4_kway_pqrcpend ................***Timeout 606.40 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 
Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.885796e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.957335e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.739184e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.752858e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.103403e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 1.259085e+00 s Time to initialize coeftab 8.387290e-01 s Time to factorize 4.120617e+01 s (529.52 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ 
Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 5.646368e+00 s Time for refinement 4.767154e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.046283e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.046283e-07 max(|| b_i - A x_i ||_1) 9.751899e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.460743e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.046283e-07 max(|| b_i - A x_i ||_1) 9.751899e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.460743e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 9.751899e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.460743e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.046283e-07 max(|| b_i - A x_i ||_1) 9.751899e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.460743e+00 (SUCCESS) Start 3370: mpi_dst_example_simple_lap_c_facto4_sched4_kway_pqrcpend 2997/3626 Test #3372: mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_pqrcpend .....***Timeout 606.05 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: 
Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.139474e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.334023e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.170958e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.282882e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.703520e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 1.995679e-01 s Time to initialize coeftab 1.718576e-01 s Time to factorize 3.386959e+01 s (644.22 KFlop/s) 
Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 7.935952e+00 s Time for refinement 3.692734e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.140137e-07 max(|| b_i - A x_i ||_1) 1.247052e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.146745e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.140137e-07 max(|| b_i - A x_i ||_1) 1.247052e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.146745e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.140137e-07 max(|| b_i - A x_i ||_1) 1.247052e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.146745e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 8.140137e-07 max(|| b_i - A x_i ||_1) 1.247052e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.146745e+00 (SUCCESS) Start 3372: mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_pqrcpend 2997/3626 Test #3376: mpi_dst_example_simple_lap_c_facto4_sched4_kway_rqrcpend ................***Timeout 605.00 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: 
Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.709412e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.565753e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.065024e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.860753e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.412278e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal 
csc 2.793214e+00 s Time to initialize coeftab 8.831553e-01 s Time to factorize 1.130861e+02 s (192.94 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Start 3376: mpi_dst_example_simple_lap_c_facto4_sched4_kway_rqrcpend 2997/3626 Test #3378: mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_rqrcpend .....***Timeout 604.42 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 2: 200 760 1: 300 1140 3: 200 660 
+-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.732558e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.756988e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.605465e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 8.429869e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.877247e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 3.106725e+00 s Time to initialize coeftab 5.778902e-01 s Time to factorize 7.111821e+01 s (306.80 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Start 3378: mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_rqrcpend 2997/3626 Test #3380: mpi_dst_example_simple_lap_c_facto4_sched4_not_tqrcpend .................***Timeout 604.03 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number 
of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.358024e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.982306e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.580500e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.109671e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.898748e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 8.423521e-01 s Time to initialize coeftab 
4.269515e-01 s Time to factorize 7.071654e+01 s (308.55 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.916607e+01 s Time for refinement 6.272291e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.828042e-07 max(|| b_i - A x_i ||_1) 7.949425e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.005917e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.828042e-07 max(|| b_i - A x_i ||_1) 7.949425e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.005917e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.828042e-07 max(|| b_i - A x_i ||_1) 7.949425e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.005917e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.828042e-07 max(|| b_i - A x_i ||_1) 7.949425e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.005917e+00 (SUCCESS) Start 3380: mpi_dst_example_simple_lap_c_facto4_sched4_not_tqrcpend 2997/3626 Test #3382: mpi_dst_example_simple_lap_c_facto4_sched4_kway_tqrcpend ................***Timeout 603.82 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: 
sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.444100e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.165326e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.599468e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.200096e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.213687e+01 s +-------------------------------------------------+ Factorization 
task: Factorization used: LDL^h Time to initialize internal csc 1.604443e+00 s Time to initialize coeftab 2.831069e-01 s Time to factorize 5.211626e+01 s (418.67 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.086397e+01 s Time for refinement 3.229871e+00 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.033653e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.033653e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.033653e-07 max(|| b_i - A x_i ||_1) 1.378404e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.478193e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.033653e-07 max(|| b_i - A x_i ||_1) 1.378404e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.478193e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.378404e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.478193e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.378404e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.478193e+00 (SUCCESS) Start 3382: mpi_dst_example_simple_lap_c_facto4_sched4_kway_tqrcpend 2997/3626 Test #3384: mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_tqrcpend .....***Timeout 603.10 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.129692e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.428043e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.172786e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.489116e+00 s 
+-------------------------------------------------+ Analyze task: Total time for analyze 2.527562e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 8.214170e-01 s Time to initialize coeftab 1.359385e+00 s Time to factorize 3.383496e+01 s (644.88 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 7.149738e+00 s Time for refinement 3.831978e+00 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.979330e-07 max(|| b_i - A x_i ||_1) 8.245750e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.080690e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.979330e-07 max(|| b_i - A x_i ||_1) 8.245750e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.080690e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.979330e-07 max(|| b_i - A x_i ||_1) 8.245750e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.080690e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.979330e-07 max(|| b_i - A x_i ||_1) 8.245750e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.080690e+00 (SUCCESS) Start 3384: mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_tqrcpend 2997/3626 Test #3386: mpi_dst_example_simple_lap_c_facto4_sched4_not_rqrrtend .................***Timeout 602.39 sec ischedInit: The 
thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.922245e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.166485e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.112924e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 
21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.717852e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.380451e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 7.723213e-01 s Time to initialize coeftab 1.006118e+00 s Time to factorize 4.876611e+01 s (447.43 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.215725e+01 s Time for refinement 4.028413e+00 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.177143e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.177143e-07 max(|| b_i - A x_i ||_1) 1.135866e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.866186e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.177143e-07 max(|| b_i - A x_i ||_1) 1.135866e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.866186e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.177143e-07 max(|| b_i - A x_i ||_1) 1.135866e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.866186e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.135866e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.866186e+00 (SUCCESS) Start 3386: mpi_dst_example_simple_lap_c_facto4_sched4_not_rqrrtend 2997/3626 Test #3388: 
mpi_dst_example_simple_lap_c_facto4_sched4_kway_rqrrtend ................***Timeout 601.77 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.765851e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.182273e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.075058e-01 s +-------------------------------------------------+ Mapping/Scheduling 
subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 4.260769e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.328051e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 6.178504e-01 s Time to initialize coeftab 1.809933e-01 s Time to factorize 6.725024e+01 s (324.45 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.614167e+01 s Time for refinement 9.835140e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.848051e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.848051e-07 max(|| b_i - A x_i ||_1) 1.554879e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.923500e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.848051e-07 max(|| b_i - A x_i ||_1) 1.554879e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.923500e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.554879e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.923500e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.848051e-07 max(|| b_i - A x_i ||_1) 1.554879e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.923500e+00 
(SUCCESS) Start 3388: mpi_dst_example_simple_lap_c_facto4_sched4_kway_rqrrtend 2997/3626 Test #3390: mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_rqrrtend .....***Timeout 601.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.570134e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.418373e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping 
criterion -1 Time for reordering 3.890031e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.972071e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.230977e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 1.036657e+00 s Time to initialize coeftab 5.389528e-01 s Time to factorize 5.390075e+01 s (404.81 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 2.062325e+01 s Time for refinement 4.508646e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.816227e-07 max(|| b_i - A x_i ||_1) 7.917023e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.997741e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.816227e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.816227e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.816227e-07 max(|| b_i - A x_i ||_1) 7.917023e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.997741e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 7.917023e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 
1.997741e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 7.917023e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.997741e+00 (SUCCESS) Start 3390: mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_rqrrtend 2997/3626 Test #3391: mpi_dst_example_simple_lap_c_facto4_sched4_kway_pqrcpilu0 ...............***Timeout 599.41 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 1: 300 1140 2: 200 760 3: 200 660 0: 300 1140 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.507329e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute 
symbol matrix 1.373655e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.216623e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.229392e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.272591e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 2.284273e+00 s Time to initialize coeftab 7.974784e-01 s Time to factorize 8.943181e+01 s (243.98 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Start 3391: mpi_dst_example_simple_lap_c_facto4_sched4_kway_pqrcpilu0 2997/3626 Test #3398: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_svdend .......***Timeout 612.21 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: 
PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.282381e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.771665e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.172914e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.875039e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.720553e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.452999e+00 s Time to initialize coeftab 1.599858e+00 s Time to factorize 4.729122e+01 s (439.14 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o 
Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 5.983285e+00 s - iteration 1 : total iteration time 27.1 s error 4.1481e-15 Time for refinement 3.955047e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.151612e-15 max(|| b_i - A x_i ||_1) 4.042310e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.020012e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.151612e-15 max(|| b_i - A x_i ||_1) 4.042310e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.020012e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.151612e-15 max(|| b_i - A x_i ||_1) 4.042310e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.020012e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.151612e-15 max(|| b_i - A x_i ||_1) 4.042310e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.020012e-02 (SUCCESS) Start 3398: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_svdend 2997/3626 Test #3400: mpi_dst_example_simple_lap_z_facto0_sched4_not_pqrcpend .................***Timeout 611.79 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: 
Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.383051e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.303747e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.152203e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.907972e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.913951e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.299007e+00 s Time to initialize coeftab 1.599338e+00 s Time to factorize 2.100306e+01 s 
(988.79 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 8.379679e+00 s - iteration 1 : total iteration time 10.5 s error 3.2895e-16 Time for refinement 1.982646e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.546996e-16 max(|| b_i - A x_i ||_1) 8.190054e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.066628e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.546996e-16 max(|| b_i - A x_i ||_1) 8.190054e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.066628e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.546996e-16 max(|| b_i - A x_i ||_1) 8.190054e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.066628e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.546996e-16 max(|| b_i - A x_i ||_1) 8.190054e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.066628e-03 (SUCCESS) Start 3400: mpi_dst_example_simple_lap_z_facto0_sched4_not_pqrcpend 2997/3626 Test #3402: mpi_dst_example_simple_lap_z_facto0_sched4_kway_pqrcpend ................***Timeout 610.35 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.849246e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.554878e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.769824e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.925909e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.409586e+01 s 
+-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 8.352044e-01 s Time to initialize coeftab 1.417170e+00 s Time to factorize 5.805177e+01 s (357.74 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 9.357792e+00 s - iteration 1 : total iteration time 28.4 s error 2.8168e-15 Time for refinement 4.113050e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.817680e-15 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.817680e-15 max(|| b_i - A x_i ||_1) 2.805315e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.078760e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 2.805315e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.078760e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.817680e-15 max(|| b_i - A x_i ||_1) 2.805315e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.078760e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.817680e-15 max(|| b_i - A x_i ||_1) 2.805315e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.078760e-03 (SUCCESS) Start 3402: mpi_dst_example_simple_lap_z_facto0_sched4_kway_pqrcpend 2997/3626 Test #3404: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_pqrcpend .....***Timeout 608.69 sec ischedInit: The thread number has been automatically set to 256 ischedInit: 
The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.167757e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.413765e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.186678e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time 
to factorize 5.477282e-04 s Time for mapping/scheduling 1.002938e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.304156e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 5.504994e-01 s Time to initialize coeftab 5.663868e-01 s Time to factorize 1.129755e+01 s ( 1.80 MFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 1.690321e+01 s - iteration 1 : total iteration time 11.3 s error 4.131e-16 Time for refinement 2.008489e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.318242e-16 max(|| b_i - A x_i ||_1) 9.883491e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.493940e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.318242e-16 max(|| b_i - A x_i ||_1) 9.883491e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.493940e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.318242e-16 max(|| b_i - A x_i ||_1) 9.883491e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.493940e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.318242e-16 max(|| b_i - A x_i ||_1) 9.883491e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.493940e-03 (SUCCESS) Start 3404: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_pqrcpend 2997/3626 
Test #3406: mpi_dst_example_simple_lap_z_facto0_sched4_not_rqrcpend .................***Timeout 607.89 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.335143e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.266774e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.061291e-01 s +-------------------------------------------------+ 
Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 4.345867e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.899843e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.486815e+00 s Time to initialize coeftab 8.111984e-01 s Time to factorize 7.062301e+01 s (294.06 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 1.956851e+01 s - iteration 1 : total iteration time 11.1 s error 5.6002e-16 Time for refinement 1.815509e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.758889e-16 max(|| b_i - A x_i ||_1) 8.997151e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.270286e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.758889e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.758889e-16 max(|| b_i - A x_i ||_1) 8.997151e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.270286e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 8.997151e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.270286e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.758889e-16 max(|| b_i - A x_i ||_1) 
8.997151e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.270286e-03 (SUCCESS) Start 3406: mpi_dst_example_simple_lap_z_facto0_sched4_not_rqrcpend 2997/3626 Test #3408: mpi_dst_example_simple_lap_z_facto0_sched4_kway_rqrcpend ................***Timeout 605.64 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.055881e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.311319e-03 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.470169e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.651023e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.502131e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 9.418830e-01 s Time to initialize coeftab 6.464977e-01 s Time to factorize 1.082796e+02 s (191.80 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Start 3408: mpi_dst_example_simple_lap_z_facto0_sched4_kway_rqrcpend 2997/3626 Test #3410: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_rqrcpend .....***Timeout 603.07 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple 
Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.314663e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.209633e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.531142e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 4.811993e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.842635e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.245687e+00 s Time to initialize coeftab 7.293002e-01 s Time to factorize 3.393996e+01 s (611.89 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank 
supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Start 3410: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_rqrcpend Test #3087: mpi_dst_example_simple_lap_s_facto1_sched4_kway_rqrcpbegin ..............***Timeout 548.70 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.010045e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in 
of L 17.592432 Time to compute symbol matrix 7.736111e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.517258e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.057072e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.790815e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.102235e+00 s Time to initialize coeftab 2.765047e+00 s Time to factorize 9.003464e+01 s (59.52 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.2 Ko / 44.3 Ko ------------------------------------------------ Total 68.4 Ko / 68.5 Ko Start 3442: mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_rqrcpend Start 3443: mpi_dst_example_simple_lap_z_facto1_sched4_not_tqrcpbegin Start 3444: mpi_dst_example_simple_lap_z_facto1_sched4_not_tqrcpend Start 3445: mpi_dst_example_simple_lap_z_facto1_sched4_kway_tqrcpbegin Start 3446: mpi_dst_example_simple_lap_z_facto1_sched4_kway_tqrcpend Start 3447: mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_tqrcpbegin Start 3448: mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_tqrcpend Start 3449: mpi_dst_example_simple_lap_z_facto1_sched4_not_rqrrtbegin Start 3450: mpi_dst_example_simple_lap_z_facto1_sched4_not_rqrrtend Start 3451: mpi_dst_example_simple_lap_z_facto1_sched4_kway_rqrrtbegin Start 3452: mpi_dst_example_simple_lap_z_facto1_sched4_kway_rqrrtend 
Start 3453: mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_rqrrtbegin Start 3454: mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_rqrrtend Start 3455: mpi_dst_example_simple_lap_z_facto1_sched4_kway_pqrcpilu0 Start 3456: mpi_dst_example_simple_lap_z_facto1_sched4_kway_pqrcpilu1 Start 3457: mpi_dst_example_simple_lap_z_facto2_sched4_not_svdbegin Start 3458: mpi_dst_example_simple_lap_z_facto2_sched4_not_svdend Start 3459: mpi_dst_example_simple_lap_z_facto2_sched4_kway_svdbegin Start 3460: mpi_dst_example_simple_lap_z_facto2_sched4_kway_svdend Start 3461: mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_svdbegin Start 3462: mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_svdend Start 3463: mpi_dst_example_simple_lap_z_facto2_sched4_not_pqrcpbegin Start 3464: mpi_dst_example_simple_lap_z_facto2_sched4_not_pqrcpend Start 3465: mpi_dst_example_simple_lap_z_facto2_sched4_kway_pqrcpbegin Start 3466: mpi_dst_example_simple_lap_z_facto2_sched4_kway_pqrcpend Start 3467: mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_pqrcpbegin Start 3468: mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_pqrcpend Start 3469: mpi_dst_example_simple_lap_z_facto2_sched4_not_rqrcpbegin Start 3470: mpi_dst_example_simple_lap_z_facto2_sched4_not_rqrcpend Start 3471: mpi_dst_example_simple_lap_z_facto2_sched4_kway_rqrcpbegin Start 3472: mpi_dst_example_simple_lap_z_facto2_sched4_kway_rqrcpend Start 3473: mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_rqrcpbegin Start 3474: mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_rqrcpend Start 3475: mpi_dst_example_simple_lap_z_facto2_sched4_not_tqrcpbegin Start 3476: mpi_dst_example_simple_lap_z_facto2_sched4_not_tqrcpend Start 3477: mpi_dst_example_simple_lap_z_facto2_sched4_kway_tqrcpbegin Start 3478: mpi_dst_example_simple_lap_z_facto2_sched4_kway_tqrcpend Start 3479: mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_tqrcpbegin Start 3480: 
mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_tqrcpend Start 3481: mpi_dst_example_simple_lap_z_facto2_sched4_not_rqrrtbegin Start 3482: mpi_dst_example_simple_lap_z_facto2_sched4_not_rqrrtend Start 3483: mpi_dst_example_simple_lap_z_facto2_sched4_kway_rqrrtbegin Start 3484: mpi_dst_example_simple_lap_z_facto2_sched4_kway_rqrrtend Start 3485: mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_rqrrtbegin Start 3486: mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_rqrrtend Start 3487: mpi_dst_example_simple_lap_z_facto2_sched4_kway_pqrcpilu0 Start 3488: mpi_dst_example_simple_lap_z_facto2_sched4_kway_pqrcpilu1 Start 3489: mpi_dst_example_simple_lap_z_facto3_sched4_not_svdbegin Start 3490: mpi_dst_example_simple_lap_z_facto3_sched4_not_svdend Start 3491: mpi_dst_example_simple_lap_z_facto3_sched4_kway_svdbegin Start 3492: mpi_dst_example_simple_lap_z_facto3_sched4_kway_svdend Start 3493: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_svdbegin Start 3494: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_svdend Start 3495: mpi_dst_example_simple_lap_z_facto3_sched4_not_pqrcpbegin Start 3496: mpi_dst_example_simple_lap_z_facto3_sched4_not_pqrcpend Start 3497: mpi_dst_example_simple_lap_z_facto3_sched4_kway_pqrcpbegin Start 3498: mpi_dst_example_simple_lap_z_facto3_sched4_kway_pqrcpend Start 3499: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_pqrcpbegin Start 3500: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_pqrcpend Start 3501: mpi_dst_example_simple_lap_z_facto3_sched4_not_rqrcpbegin Start 3502: mpi_dst_example_simple_lap_z_facto3_sched4_not_rqrcpend Start 3503: mpi_dst_example_simple_lap_z_facto3_sched4_kway_rqrcpbegin Start 3504: mpi_dst_example_simple_lap_z_facto3_sched4_kway_rqrcpend Start 3505: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_rqrcpbegin Start 3506: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_rqrcpend Start 3507: 
mpi_dst_example_simple_lap_z_facto3_sched4_not_tqrcpbegin Start 3508: mpi_dst_example_simple_lap_z_facto3_sched4_not_tqrcpend Start 3509: mpi_dst_example_simple_lap_z_facto3_sched4_kway_tqrcpbegin Test #2904: mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_tqrcpend .....***Timeout 707.64 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.877269e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 
4.028613e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.469321e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 7.634632e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.954205e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.793734e+00 s Time to initialize coeftab 4.761250e-01 s Time to factorize 1.305276e+02 s (159.11 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Test #2907: mpi_dst_example_simple_lap_z_facto0_sched1_kway_rqrrtbegin ..............***Timeout 707.61 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal 
Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.306811e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.363826e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.728653e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 7.386175e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.216214e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.002054e+00 s Time to initialize coeftab 7.397863e+00 s Time to factorize 9.854783e+01 s (210.74 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 
o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 3.227416e+00 s Test #2910: mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_rqrrtend .....***Timeout 707.59 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.350095e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.664023e-03 s +-------------------------------------------------+ 
Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.760861e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.248744e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.995086e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 8.170223e+00 s Time to initialize coeftab 1.063698e+00 s Time to factorize 6.138758e+01 s (338.30 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Test #2912: mpi_dst_example_simple_lap_z_facto0_sched1_kway_pqrcpilu1 ...............***Timeout 707.58 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 
2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.583810e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.600914e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.422364e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 6.563182e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.977845e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.944940e+00 s Time to initialize coeftab 1.044584e+00 s Time to factorize 8.624000e+01 s (240.81 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Test #2915: mpi_dst_example_simple_lap_z_facto1_sched1_kway_svdbegin 
................***Timeout 707.56 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.148244e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.793875e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.224713e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 
17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.205975e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.652716e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 8.758100e-01 s Time to initialize coeftab 3.551959e+00 s Time to factorize 1.832241e+02 s (119.09 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Test #2916: mpi_dst_example_simple_lap_z_facto1_sched1_kway_svdend ..................***Timeout 707.51 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of 
projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.925808e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.158912e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.858876e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 4.535558e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.554847e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.349947e+00 s Time to initialize coeftab 7.322422e-01 s Test #3063: mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_tqrcpbegin ...***Timeout 707.48 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution 
level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.611410e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.686935e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.085215e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.491271e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.428913e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 5.794536e+00 s Time to initialize coeftab 2.986383e+00 s Time to factorize 8.079015e+01 s (64.16 KFlop/s) 
Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44 Ko / 44.3 Ko ------------------------------------------------ Total 68.2 Ko / 68.5 Ko Time to solve 6.882093e+00 s Start 3063: mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_tqrcpbegin Test #3076: mpi_dst_example_simple_lap_s_facto1_sched4_kway_svdend ..................***Timeout 707.44 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to 
compute ordering 3.030725e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.731821e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.145675e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.857162e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.652579e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.503679e+00 s Start 3076: mpi_dst_example_simple_lap_s_facto1_sched4_kway_svdend Test #3077: mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_svdbegin .....***Timeout 707.43 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting 
Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.670024e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.416075e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.426871e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 8.231510e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.162928e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.656065e+00 s Start 3077: mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_svdbegin Test #3078: mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_svdend .......***Timeout 707.37 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package 
+ +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.777356e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.150162e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.102561e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 8.921999e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.377575e+01 s 
+-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.753316e+00 s Time to initialize coeftab 7.400676e-01 s Start 3078: mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_svdend Test #3079: mpi_dst_example_simple_lap_s_facto1_sched4_not_pqrcpbegin ...............***Timeout 707.36 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.240278e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 
Fill-in of L 17.592432 Time to compute symbol matrix 6.194097e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.469727e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Start 3079: mpi_dst_example_simple_lap_s_facto1_sched4_not_pqrcpbegin Test #3085: mpi_dst_example_simple_lap_s_facto1_sched4_not_rqrcpbegin ...............***Timeout 707.30 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 3085: mpi_dst_example_simple_lap_s_facto1_sched4_not_rqrcpbegin Test #3091: mpi_dst_example_simple_lap_s_facto1_sched4_not_tqrcpbegin ...............***Timeout 707.26 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3091: mpi_dst_example_simple_lap_s_facto1_sched4_not_tqrcpbegin Test #3095: mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_tqrcpbegin ...***Timeout 707.24 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 
Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.160708e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.677101e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.079985e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.472821e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.179655e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.230288e+00 s Time to initialize coeftab 2.022041e+00 s Time to factorize 1.036086e+02 s (51.72 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: 
------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44 Ko / 44.3 Ko ------------------------------------------------ Total 68.2 Ko / 68.5 Ko Start 3095: mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_tqrcpbegin Test #3098: mpi_dst_example_simple_lap_s_facto1_sched4_not_rqrrtend .................***Timeout 707.22 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.892839e+01 s +-------------------------------------------------+ Symbolic 
factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.674575e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.700901e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.848672e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.059133e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.140111e+00 s Time to initialize coeftab 5.171999e-01 s Start 3098: mpi_dst_example_simple_lap_s_facto1_sched4_not_rqrrtend Test #3110: mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_svdend .......***Timeout 707.17 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance 
criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.695771e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.531411e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.165236e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 6.018501e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.769721e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.960369e+00 s Time to initialize coeftab 4.744813e-01 s Start 3110: mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_svdend Test #3113: mpi_dst_example_simple_lap_s_facto2_sched4_kway_pqrcpbegin ..............***Timeout 706.84 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + 
PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.592111e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.304088e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.922666e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 9.166809e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.859648e+01 s +-------------------------------------------------+ Factorization task: 
Factorization used: LU Time to initialize internal csc 3.865892e+00 s Time to initialize coeftab 1.518008e+00 s Start 3113: mpi_dst_example_simple_lap_s_facto2_sched4_kway_pqrcpbegin Test #3117: mpi_dst_example_simple_lap_s_facto2_sched4_not_rqrcpbegin ...............***Timeout 706.64 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3117: mpi_dst_example_simple_lap_s_facto2_sched4_not_rqrcpbegin Test #3119: mpi_dst_example_simple_lap_s_facto2_sched4_kway_rqrcpbegin ..............***Timeout 706.61 sec ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.444623e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.884592e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.918935e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to 
factorize 1.548061e-03 s Time for mapping/scheduling 1.099741e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.697536e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 4.081459e+00 s Time to initialize coeftab 3.884492e+00 s Time to factorize 4.703743e+01 s (217.36 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Start 3119: mpi_dst_example_simple_lap_s_facto2_sched4_kway_rqrcpbegin Test #3123: mpi_dst_example_simple_lap_s_facto2_sched4_not_tqrcpbegin ...............***Timeout 706.49 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically 
set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3123: mpi_dst_example_simple_lap_s_facto2_sched4_not_tqrcpbegin Test #2913: mpi_dst_example_simple_lap_z_facto1_sched1_not_svdbegin .................***Timeout 706.46 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.125225e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax 
Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.211421e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.022091e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 4.938811e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.820767e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.385586e+00 s Time to initialize coeftab 3.102889e+00 s Test #3127: mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_tqrcpbegin ...***Timeout 706.36 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 
Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.471235e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.510427e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.671759e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Start 3127: mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_tqrcpbegin Test #3129: mpi_dst_example_simple_lap_s_facto2_sched4_not_rqrrtbegin ...............***Timeout 706.34 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of 
kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.702466e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.009480e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.813882e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 5.393076e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.666380e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.373686e+00 s Time to initialize coeftab 2.932800e+00 s Start 3129: mpi_dst_example_simple_lap_s_facto2_sched4_not_rqrrtbegin Test #3130: mpi_dst_example_simple_lap_s_facto2_sched4_not_rqrrtend .................***Timeout 706.33 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3130: mpi_dst_example_simple_lap_s_facto2_sched4_not_rqrrtend Test #3137: mpi_dst_example_simple_lap_d_facto0_sched4_not_svdbegin .................***Timeout 706.32 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress 
method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.129805e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.075409e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.605687e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 5.818468e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.461964e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.036861e+00 s Time to initialize coeftab 2.442114e+00 s Start 3137: mpi_dst_example_simple_lap_d_facto0_sched4_not_svdbegin Test #3139: mpi_dst_example_simple_lap_d_facto0_sched4_kway_svdbegin ................***Timeout 706.31 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has 
been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.065503e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.060687e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.018330e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for 
mapping/scheduling 1.069904e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.704163e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.245149e+00 s Time to initialize coeftab 4.019477e+00 s Start 3139: mpi_dst_example_simple_lap_d_facto0_sched4_kway_svdbegin Test #3140: mpi_dst_example_simple_lap_d_facto0_sched4_kway_svdend ..................***Timeout 706.30 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.052253e+01 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.501809e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.272181e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.631110e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.563423e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.841622e+00 s Time to initialize coeftab 1.112105e+00 s Time to factorize 2.848774e+01 s (181.97 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko Start 3140: mpi_dst_example_simple_lap_d_facto0_sched4_kway_svdend Test #3141: mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_svdbegin .....***Timeout 706.96 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: 
Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.062645e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.317797e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.100324e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Start 3141: mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_svdbegin Test #3151: mpi_dst_example_simple_lap_d_facto0_sched4_kway_rqrcpbegin ..............***Timeout 706.85 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ 
Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3151: mpi_dst_example_simple_lap_d_facto0_sched4_kway_rqrcpbegin Test #3154: mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_rqrcpend .....***Timeout 707.86 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - 
CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.701789e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.012294e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.576969e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 5.559178e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.057979e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Start 3154: mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_rqrcpend Test #3155: mpi_dst_example_simple_lap_d_facto0_sched4_not_tqrcpbegin ...............***Timeout 708.44 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX 
package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3155: mpi_dst_example_simple_lap_d_facto0_sched4_not_tqrcpbegin Test #3157: mpi_dst_example_simple_lap_d_facto0_sched4_kway_tqrcpbegin ..............***Timeout 708.28 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution 
level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3157: mpi_dst_example_simple_lap_d_facto0_sched4_kway_tqrcpbegin Test #2917: mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_svdbegin .....***Timeout 708.20 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and 
projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.998967e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.468024e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.029944e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.331674e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.380959e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.829277e-02 s Time to initialize coeftab 1.715116e+00 s Test #2918: mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_svdend .......***Timeout 708.18 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 
256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Test #2919: mpi_dst_example_simple_lap_z_facto1_sched1_not_pqrcpbegin ...............***Timeout 708.15 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian 
Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.046577e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.319140e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.863442e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.592395e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.303835e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 6.091486e+00 s Time to initialize coeftab 2.600084e+00 s Test #2920: mpi_dst_example_simple_lap_z_facto1_sched1_not_pqrcpend .................***Timeout 708.15 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution 
level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.022688e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.854774e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.005235e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.173846e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.887162e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.592825e+00 s Time to initialize coeftab 4.855329e-01 s Test #2921: mpi_dst_example_simple_lap_z_facto1_sched1_kway_pqrcpbegin ..............***Timeout 708.14 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: 
The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.704240e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.070581e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.362460e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.306276e+00 s 
+-------------------------------------------------+ Analyze task: Total time for analyze 2.952617e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 5.218135e+00 s Time to initialize coeftab 3.265009e+00 s Time to factorize 1.072287e+02 s (203.48 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Test #2924: mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_pqrcpend .....***Timeout 708.10 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically 
set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.749590e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.651166e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.464766e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.251095e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.244627e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Test #2926: mpi_dst_example_simple_lap_z_facto1_sched1_not_rqrcpend .................***Timeout 707.96 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time 
Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.445967e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.807473e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.788520e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Test #2927: mpi_dst_example_simple_lap_z_facto1_sched1_kway_rqrcpbegin ..............***Timeout 707.90 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size 
(min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.617373e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.471223e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.625696e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.102651e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.965075e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.971491e+00 s Time to initialize coeftab 7.376817e+00 s Test #2928: mpi_dst_example_simple_lap_z_facto1_sched1_kway_rqrcpend ................***Timeout 707.87 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been 
automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2929: mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_rqrcpbegin ...***Timeout 707.85 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 
6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2930: mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_rqrcpend .....***Timeout 707.83 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections 
Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.207908e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.698346e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.499923e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.638392e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.809464e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.686165e+00 s Time to initialize coeftab 3.483028e-01 s Test #2931: mpi_dst_example_simple_lap_z_facto1_sched1_not_tqrcpbegin ...............***Timeout 707.81 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple 
Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2933: mpi_dst_example_simple_lap_z_facto1_sched1_kway_tqrcpbegin ..............***Timeout 707.71 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting 
Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.003303e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.176614e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.716426e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.165916e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.937384e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.631330e+00 s Time to initialize coeftab 4.151830e+00 s Test #2934: mpi_dst_example_simple_lap_z_facto1_sched1_kway_tqrcpend ................***Timeout 707.68 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: 
Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.388070e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.830415e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.718807e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Test #2935: mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_tqrcpbegin ...***Timeout 707.65 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled 
PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.137434e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.012477e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.813126e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 9.807904e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.180226e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.917790e+00 s Time to initialize coeftab 
6.223965e+00 s Test #2936: mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_tqrcpend .....***Timeout 707.57 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.506170e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.398253e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.770967e-01 s 
+-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.228174e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.181745e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 7.118045e+00 s Time to initialize coeftab 7.163573e-01 s Test #2940: mpi_dst_example_simple_lap_z_facto1_sched1_kway_rqrrtend ................***Timeout 707.35 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 
200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.335521e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.583945e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.410692e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Test #2942: mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_rqrrtend .....***Timeout 707.24 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 
Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.045826e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.466979e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.697727e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.026339e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.562011e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 5.047282e+00 s Time to initialize coeftab 1.434997e+00 s Test #2943: mpi_dst_example_simple_lap_z_facto1_sched1_kway_pqrcpilu0 ...............***Timeout 707.23 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: 
Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.117666e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.107410e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.698323e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.265824e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.500468e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.839074e+00 s Time to initialize coeftab 5.033836e-01 s Test #2945: mpi_dst_example_simple_lap_z_facto2_sched1_not_svdbegin .................***Timeout 707.18 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2950: mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_svdend .......***Timeout 707.07 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: 
PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.235192e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.883571e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.438121e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Test #2951: mpi_dst_example_simple_lap_z_facto2_sched1_not_pqrcpbegin ...............***Timeout 707.05 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of 
GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.532763e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.886804e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.013461e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 8.600236e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.710590e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 4.486696e+00 s Time to initialize coeftab 3.975985e+00 s Test #2955: mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_pqrcpbegin ...***Timeout 706.95 sec ischedInit: The thread number has been automatically set to 256 
ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.306555e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.927087e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.413466e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 
6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 6.439846e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.545318e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.906347e+00 s Time to initialize coeftab 2.041855e+00 s Test #2956: mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_pqrcpend .....***Timeout 707.42 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.800484e+01 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.857471e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.145224e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 8.043380e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.025361e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.910516e+00 s Time to initialize coeftab 8.220307e-01 s Time to factorize 1.562306e+01 s ( 2.56 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Test #2958: mpi_dst_example_simple_lap_z_facto2_sched1_not_rqrcpend .................***Timeout 707.37 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - 
Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.214580e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.115352e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.181880e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 8.775268e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.734199e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 4.008002e+00 s Time to initialize coeftab 6.270839e-01 s Time to factorize 4.383202e+01 s (933.78 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o 
Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Test #2959: mpi_dst_example_simple_lap_z_facto2_sched1_kway_rqrcpbegin ..............***Timeout 707.50 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2960: mpi_dst_example_simple_lap_z_facto2_sched1_kway_rqrcpend ................***Timeout 707.90 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch 1: 300 1140 2: 200 760 3: 200 660 Time to compute ordering 2.325539e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.515118e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.298932e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.062644e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.684588e+01 s +-------------------------------------------------+ 
Factorization task: Factorization used: LU Time to initialize internal csc 5.978619e+00 s Time to initialize coeftab 9.536978e-01 s Time to factorize 5.833194e+01 s (701.66 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 355 Ko / 355 Ko ------------------------------------------------ Total 451 Ko / 451 Ko Test #2961: mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_rqrcpbegin ...***Timeout 707.82 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 
+-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.655892e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.185952e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.500684e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 4.196417e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.394337e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.647787e+00 s Time to initialize coeftab 8.406937e+00 s Time to factorize 1.382561e+02 s (296.04 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Test #2962: mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_rqrcpend .....***Timeout 707.83 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: 
Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.823481e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.536107e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.702455e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Test #2963: mpi_dst_example_simple_lap_z_facto2_sched1_not_tqrcpbegin ...............***Timeout 708.29 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled 
Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2966: mpi_dst_example_simple_lap_z_facto2_sched1_kway_tqrcpend ................***Timeout 708.13 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion 
per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.674276e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.376361e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.383425e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.269449e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.235025e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 7.723516e+00 s Time to initialize coeftab 7.799210e-01 s Time to factorize 7.073097e+01 s (578.66 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 355 Ko / 355 Ko ------------------------------------------------ Total 451 Ko / 451 Ko Test #2967: mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_tqrcpbegin ...***Timeout 708.09 sec 
ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2968: mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_tqrcpend .....***Timeout 708.04 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled 
thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.605703e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.527083e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.845552e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Test #2974: mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_rqrrtend .....***Timeout 707.80 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of 
GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.236704e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.057756e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.021985e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 8.662592e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.539512e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 4.437930e+00 s Time to initialize 
coeftab 6.879351e-01 s Time to factorize 6.146559e+01 s (665.89 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 355 Ko / 355 Ko Test #2977: mpi_dst_example_simple_lap_z_facto3_sched1_not_svdbegin .................***Timeout 708.33 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2978: mpi_dst_example_simple_lap_z_facto3_sched1_not_svdend ...................***Timeout 708.24 sec 
ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2979: mpi_dst_example_simple_lap_z_facto3_sched1_kway_svdbegin ................***Timeout 708.31 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple 
Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.892638e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.716305e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.553373e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 5.711788e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.359099e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 2.250237e+00 s Test #2980: mpi_dst_example_simple_lap_z_facto3_sched1_kway_svdend 
..................***Timeout 708.72 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.435969e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.339567e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.766815e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 
17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 6.729159e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.136818e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 2.698027e+00 s Time to initialize coeftab 2.008023e+00 s Time to factorize 4.190735e+01 s (495.56 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Test #2981: mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_svdbegin .....***Timeout 708.60 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute 
Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.099742e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.004716e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.846022e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Test #2982: mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_svdend .......***Timeout 708.76 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS 
Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2985: mpi_dst_example_simple_lap_z_facto3_sched1_kway_pqrcpbegin ..............***Timeout 709.05 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute 
ordering 4.012775e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.042363e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.031843e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.234004e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.351457e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Test #2987: mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_pqrcpbegin ...***Timeout 708.92 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of 
kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2988: mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_pqrcpend .....***Timeout 708.87 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute 
ordering 2.605405e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.624272e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.014821e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 4.976135e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.688493e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 1.922044e+00 s Time to initialize coeftab 1.273250e+00 s Test #2990: mpi_dst_example_simple_lap_z_facto3_sched1_not_rqrcpend .................***Timeout 709.44 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 
0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2993: mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_rqrcpbegin ...***Timeout 709.10 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 
+-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #2994: mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_rqrcpend .....***Timeout 709.02 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.888277e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.524044e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 
Stoping criterion -1 Time for reordering 4.505850e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Test #2995: mpi_dst_example_simple_lap_z_facto3_sched1_not_tqrcpbegin ...............***Timeout 708.93 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.100913e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.053866e-02 s +-------------------------------------------------+ 
Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.100669e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.389376e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.878046e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 4.957373e+00 s Test #2999: mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_tqrcpbegin ...***Timeout 709.35 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: 
CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.947779e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.782334e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.108597e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.884418e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.460480e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 2.166235e+00 s Time to initialize coeftab 1.719026e+01 s Test #3000: mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_tqrcpend .....***Timeout 709.34 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank 
parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.266975e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.311587e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.843983e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.221888e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.846470e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 4.982325e+00 s Time to initialize coeftab 1.072965e+00 s Time to factorize 8.902079e+01 s (233.29 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko 
Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko Test #3006: mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_rqrrtend .....***Timeout 709.30 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.084286e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.337570e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion 
-1 Time for reordering 2.668330e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 5.675282e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.577788e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 5.148143e+00 s Time to initialize coeftab 1.918993e+00 s Test #3007: mpi_dst_example_simple_lap_z_facto3_sched1_kway_pqrcpilu0 ...............***Timeout 709.83 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 
Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.316504e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.018127e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.690205e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.088191e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.789108e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 1.746915e+00 s Time to initialize coeftab 4.328004e-01 s Test #3009: mpi_dst_example_simple_lap_z_facto4_sched1_not_svdbegin .................***Timeout 709.80 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 
Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #3010: mpi_dst_example_simple_lap_z_facto4_sched1_not_svdend ...................***Timeout 709.97 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix 
type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.006778e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.938512e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.534303e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 8.841629e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.109262e+01 s Test #3011: mpi_dst_example_simple_lap_z_facto4_sched1_kway_svdbegin ................***Timeout 710.00 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ ischedInit: The thread number has been automatically set to 256 + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy 
Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.371007e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.304650e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.343618e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 9.134974e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.726986e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 5.634248e+00 s Time to initialize coeftab 5.868561e+00 s Test #3013: mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_svdbegin .....***Timeout 709.78 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #3014: mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_svdend .......***Timeout 709.70 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational 
models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.231836e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.846686e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.817121e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 8.438005e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.305776e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 6.000269e+00 s Time to initialize coeftab 7.565145e-01 s Test #3015: mpi_dst_example_simple_lap_z_facto4_sched1_not_pqrcpbegin ...............***Timeout 709.61 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.732268e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.821068e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.100849e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 4.346178e+00 s +-------------------------------------------------+ 
Analyze task: Total time for analyze 5.542554e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 2.548765e+00 s Time to initialize coeftab 1.552396e+00 s Time to factorize 1.052731e+02 s (207.26 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Test #3016: mpi_dst_example_simple_lap_z_facto4_sched1_not_pqrcpend .................***Timeout 709.47 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: 
CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.030249e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.188054e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.475518e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 4.862157e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.903586e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 1.470075e+00 s Time to initialize coeftab 4.699965e-01 s Test #3017: mpi_dst_example_simple_lap_z_facto4_sched1_kway_pqrcpbegin ..............***Timeout 709.41 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method 
PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.690689e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.922653e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.073442e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.438645e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.462127e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 1.583022e+00 s Time to initialize coeftab 2.884407e+00 s Test #3018: mpi_dst_example_simple_lap_z_facto4_sched1_kway_pqrcpend ................***Timeout 709.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Test #3019: mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_pqrcpbegin ...***Timeout 709.35 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress 
min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #3023: mpi_dst_example_simple_lap_z_facto4_sched1_kway_rqrcpbegin ..............***Timeout 709.23 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 
+-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #3025: mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_rqrcpbegin ...***Timeout 709.11 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.150517e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.203666e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 
Stoping criterion -1 Time for reordering 1.280880e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 4.216964e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.730981e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 1.223106e+00 s Time to initialize coeftab 2.644997e+00 s Time to factorize 1.819753e+02 s (119.90 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Test #3026: mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_rqrcpend .....***Timeout 709.08 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute 
Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #3027: mpi_dst_example_simple_lap_z_facto4_sched1_not_tqrcpbegin ...............***Timeout 709.05 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 
300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #3029: mpi_dst_example_simple_lap_z_facto4_sched1_kway_tqrcpbegin ..............***Timeout 708.86 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.827444e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.141938e+00 s +-------------------------------------------------+ 
Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.450561e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.099420e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.282733e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 2.493875e+00 s Time to initialize coeftab 5.768541e+00 s Test #3031: mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_tqrcpbegin ...***Timeout 708.77 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix 
type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.597738e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.142169e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.392649e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.067390e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.709639e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Test #3033: mpi_dst_example_simple_lap_z_facto4_sched1_not_rqrrtbegin ...............***Timeout 708.70 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 
8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 1: 300 1140 2: 200 760 0: 300 1140 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.145131e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.318910e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.272224e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.047087e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.765499e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 2.078335e+00 s Time to initialize coeftab 1.051038e+01 s Test #3035: mpi_dst_example_simple_lap_z_facto4_sched1_kway_rqrrtbegin ..............***Timeout 708.55 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ 
Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.112567e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.497833e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.720627e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 8.476102e+00 s +-------------------------------------------------+ Analyze task: Total time 
for analyze 7.112727e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 3.329728e-01 s Time to initialize coeftab 5.519841e+00 s Test #3036: mpi_dst_example_simple_lap_z_facto4_sched1_kway_rqrrtend ................***Timeout 708.52 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.689750e+01 s Test #3037: mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_rqrrtbegin ...***Timeout 708.49 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been 
automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.344218e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.669627e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.799262e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 
6.932254e-04 s Time for mapping/scheduling 5.275060e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.012175e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 4.144617e+00 s Time to initialize coeftab 7.334061e+00 s Test #3038: mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_rqrrtend .....***Timeout 708.46 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.914049e+01 s +-------------------------------------------------+ Symbolic 
factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.353125e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.472888e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Test #3039: mpi_dst_example_simple_lap_z_facto4_sched1_kway_pqrcpilu0 ...............***Timeout 708.54 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.185659e+01 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.678430e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.763916e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.035160e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.067077e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 2.997101e+00 s Test #3040: mpi_dst_example_simple_lap_z_facto4_sched1_kway_pqrcpilu1 ...............***Timeout 708.46 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Disabled PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min 
ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.669140e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.565248e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.730707e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.982336e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.643131e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 2.668224e+00 s Time to initialize coeftab 6.060149e-01 s Test #3041: mpi_dst_example_simple_lap_s_facto0_sched4_not_svdbegin .................***Timeout 708.40 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 
Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.333191e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.499026e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.538070e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 8.382062e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.516636e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.427157e+00 s Time to initialize coeftab 4.644790e+00 s Test #3043: 
mpi_dst_example_simple_lap_s_facto0_sched4_kway_svdbegin ................***Timeout 708.29 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #3045: mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_svdbegin .....***Timeout 708.14 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: 
Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.596700e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.416393e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.449217e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 8.155223e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.503765e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 
3.627412e+00 s Time to initialize coeftab 3.105960e+00 s Test #3046: mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_svdend .......***Timeout 708.09 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.147872e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.240001e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 
3.753168e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Test #3047: mpi_dst_example_simple_lap_s_facto0_sched4_not_pqrcpbegin ...............***Timeout 708.06 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.047147e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.727437e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion 
-1 Time for reordering 2.945578e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.560782e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.878720e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.867975e+00 s Time to initialize coeftab 1.717251e+00 s Time to factorize 2.568418e+01 s (201.83 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Test #3054: mpi_dst_example_simple_lap_s_facto0_sched4_not_rqrcpend .................***Timeout 707.65 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy 
Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.622269e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.316700e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.850811e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 5.646919e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.900352e+01 s Test #3056: mpi_dst_example_simple_lap_s_facto0_sched4_kway_rqrcpend ................***Timeout 707.40 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication 
support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #3058: mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_rqrcpend .....***Timeout 707.31 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections 
depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #3059: mpi_dst_example_simple_lap_s_facto0_sched4_not_tqrcpbegin ...............***Timeout 707.23 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute 
ordering 2.910266e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.600840e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.191723e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 6.854010e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.794107e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.001565e+00 s Time to initialize coeftab 1.790475e+00 s Test #3159: mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_tqrcpbegin ...***Timeout 707.11 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute 
Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.351159e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.289258e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.096260e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 5.227676e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.062240e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.148291e+00 s Time to initialize coeftab 1.773119e+00 s Time to factorize 8.987571e+01 s (57.68 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Start 3159: mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_tqrcpbegin Test #3161: mpi_dst_example_simple_lap_d_facto0_sched4_not_rqrrtbegin 
...............***Timeout 707.08 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.821581e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.054638e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.390023e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Start 3161: 
mpi_dst_example_simple_lap_d_facto0_sched4_not_rqrrtbegin Test #3162: mpi_dst_example_simple_lap_d_facto0_sched4_not_rqrrtend .................***Timeout 707.65 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3162: mpi_dst_example_simple_lap_d_facto0_sched4_not_rqrrtend Test #3163: mpi_dst_example_simple_lap_d_facto0_sched4_kway_rqrrtbegin ..............***Timeout 708.90 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: 
sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.089157e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.104280e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.345558e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 5.693301e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.885294e+01 s Start 
3163: mpi_dst_example_simple_lap_d_facto0_sched4_kway_rqrrtbegin Test #3179: mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_pqrcpbegin ...***Timeout 701.74 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3179: mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_pqrcpbegin Test #3186: mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_rqrcpend .....***Timeout 690.92 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3186: mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_rqrcpend 3088/3626 Test #3258: mpi_dst_example_simple_lap_c_facto0_sched4_not_rqrrtend .................***Timeout 690.85 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled 
Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.408285e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.778161e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.633257e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.173193e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.064663e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Start 3258: mpi_dst_example_simple_lap_c_facto0_sched4_not_rqrrtend 3088/3626 Test #3259: mpi_dst_example_simple_lap_c_facto0_sched4_kway_rqrrtbegin ..............***Timeout 690.83 sec ischedInit: 
The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.618384e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.560526e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.861309e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 
20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 5.870983e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.687373e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.969176e+00 s Time to initialize coeftab 7.523898e+00 s Start 3259: mpi_dst_example_simple_lap_c_facto0_sched4_kway_rqrrtbegin 3088/3626 Test #3261: mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_rqrrtbegin ...***Timeout 690.74 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering 
subtask : Ordering method is: Scotch Time to compute ordering 3.104569e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.603547e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.750189e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 9.563201e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.273949e+01 s Start 3261: mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_rqrrtbegin 3088/3626 Test #3262: mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_rqrrtend .....***Timeout 691.33 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 
Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.881036e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.187041e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.555913e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 6.941565e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.542355e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Start 3262: mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_rqrrtend 3088/3626 Test #3264: mpi_dst_example_simple_lap_c_facto0_sched4_kway_pqrcpilu1 ...............***Timeout 691.21 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread 
static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.559436e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.692385e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.729975e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.283856e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.411985e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.874361e+00 s Time to 
initialize coeftab 8.299016e-01 s Start 3264: mpi_dst_example_simple_lap_c_facto0_sched4_kway_pqrcpilu1 3088/3626 Test #3265: mpi_dst_example_simple_lap_c_facto1_sched4_not_svdbegin .................***Timeout 691.18 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.520173e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.145547e-01 s +-------------------------------------------------+ Reordering subtask: Split level 
0 Stoping criterion -1 Time for reordering 1.455031e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 4.380166e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.413229e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.224502e+00 s Time to initialize coeftab 2.835840e+00 s Start 3265: mpi_dst_example_simple_lap_c_facto1_sched4_not_svdbegin 3088/3626 Test #3266: mpi_dst_example_simple_lap_c_facto1_sched4_not_svdend ...................***Timeout 691.40 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been 
automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.498807e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.971154e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.497532e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.263787e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.132018e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 5.577634e+00 s Time to initialize coeftab 1.096064e+00 s Start 3266: mpi_dst_example_simple_lap_c_facto1_sched4_not_svdend Test #3166: mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_rqrrtend .....***Timeout 691.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI 
communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.471068e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.563590e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.170834e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 8.806860e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.246570e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.367445e+00 s Time to initialize coeftab 4.590024e-01 s Start 3166: mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_rqrrtend Test #3167: 
mpi_dst_example_simple_lap_d_facto0_sched4_kway_pqrcpilu0 ...............***Timeout 691.35 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.953355e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.089753e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.399773e+00 s +-------------------------------------------------+ Mapping/Scheduling 
subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.349344e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.159363e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 7.986013e+00 s Time to initialize coeftab 6.502288e-01 s Start 3167: mpi_dst_example_simple_lap_d_facto0_sched4_kway_pqrcpilu0 Test #3168: mpi_dst_example_simple_lap_d_facto0_sched4_kway_pqrcpilu1 ...............***Timeout 691.32 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 
760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3168: mpi_dst_example_simple_lap_d_facto0_sched4_kway_pqrcpilu1 Test #3169: mpi_dst_example_simple_lap_d_facto1_sched4_not_svdbegin .................***Timeout 691.32 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.004633e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.095918e+00 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.117723e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Start 3169: mpi_dst_example_simple_lap_d_facto1_sched4_not_svdbegin Test #3170: mpi_dst_example_simple_lap_d_facto1_sched4_not_svdend ...................***Timeout 691.33 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.351249e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in 
L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.789353e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.961857e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.117773e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.274927e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.078394e+00 s Time to initialize coeftab 1.463082e+00 s Start 3170: mpi_dst_example_simple_lap_d_facto1_sched4_not_svdend Test #3174: mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_svdend .......***Timeout 691.88 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block 
Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3174: mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_svdend Test #3175: mpi_dst_example_simple_lap_d_facto1_sched4_not_pqrcpbegin ...............***Timeout 693.70 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask 
: Ordering method is: Scotch Time to compute ordering 2.263403e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.515210e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.571111e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 4.837670e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.811528e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.826243e+00 s Time to initialize coeftab 1.172671e+00 s Start 3175: mpi_dst_example_simple_lap_d_facto1_sched4_not_pqrcpbegin Test #3176: mpi_dst_example_simple_lap_d_facto1_sched4_not_pqrcpend .................***Timeout 694.88 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal 
width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3176: mpi_dst_example_simple_lap_d_facto1_sched4_not_pqrcpend Test #3177: mpi_dst_example_simple_lap_d_facto1_sched4_kway_pqrcpbegin ..............***Timeout 694.87 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix 
type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.329600e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.554018e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.067023e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.288764e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.020758e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.716132e+00 s Time to initialize coeftab 1.141994e+00 s Start 3177: mpi_dst_example_simple_lap_d_facto1_sched4_kway_pqrcpbegin Test #3178: mpi_dst_example_simple_lap_d_facto1_sched4_kway_pqrcpend ................***Timeout 694.86 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: 
PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3178: mpi_dst_example_simple_lap_d_facto1_sched4_kway_pqrcpend Test #3180: mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_pqrcpend .....***Timeout 694.97 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 
Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3180: mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_pqrcpend Test #3181: mpi_dst_example_simple_lap_d_facto1_sched4_not_rqrcpbegin ...............***Timeout 695.60 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 
+-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3181: mpi_dst_example_simple_lap_d_facto1_sched4_not_rqrcpbegin Test #3182: mpi_dst_example_simple_lap_d_facto1_sched4_not_rqrcpend .................***Timeout 695.60 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.318904e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.989474e-01 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.048307e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Start 3182: mpi_dst_example_simple_lap_d_facto1_sched4_not_rqrcpend Test #3183: mpi_dst_example_simple_lap_d_facto1_sched4_kway_rqrcpbegin ..............***Timeout 695.62 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.081909e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in 
L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.409553e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.820098e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.160744e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.245708e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.958046e+00 s Time to initialize coeftab 3.088457e+00 s Start 3183: mpi_dst_example_simple_lap_d_facto1_sched4_kway_rqrcpbegin 3088/3626 Test #3267: mpi_dst_example_simple_lap_c_facto1_sched4_kway_svdbegin ................***Timeout 695.53 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 
Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.134625e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.464675e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.121891e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.419459e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.472676e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.086271e+00 s Start 3267: mpi_dst_example_simple_lap_c_facto1_sched4_kway_svdbegin 3088/3626 Test #3268: mpi_dst_example_simple_lap_c_facto1_sched4_kway_svdend ..................***Timeout 695.53 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 
Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.033584e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.145242e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.138176e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.527913e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.682956e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.255433e+00 s Time to initialize coeftab 7.219806e-01 
s Start 3268: mpi_dst_example_simple_lap_c_facto1_sched4_kway_svdend 3088/3626 Test #3269: mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_svdbegin .....***Timeout 695.52 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3269: mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_svdbegin 3088/3626 Test #3270: mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_svdend .......***Timeout 695.51 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically 
set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.752151e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.302083e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.476891e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.243896e+01 s 
+-------------------------------------------------+ Analyze task: Total time for analyze 4.505926e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 6.757479e+00 s Time to initialize coeftab 1.508002e+00 s Time to factorize 5.011585e+01 s (435.38 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Start 3270: mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_svdend 3088/3626 Test #3271: mpi_dst_example_simple_lap_c_facto1_sched4_not_pqrcpbegin ...............***Timeout 695.41 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: 
Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3271: mpi_dst_example_simple_lap_c_facto1_sched4_not_pqrcpbegin 3088/3626 Test #3272: mpi_dst_example_simple_lap_c_facto1_sched4_not_pqrcpend .................***Timeout 695.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3272: mpi_dst_example_simple_lap_c_facto1_sched4_not_pqrcpend 3088/3626 Test #3273: mpi_dst_example_simple_lap_c_facto1_sched4_kway_pqrcpbegin ..............***Timeout 695.39 sec 
ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.289609e+01 s Start 3273: mpi_dst_example_simple_lap_c_facto1_sched4_kway_pqrcpbegin 3088/3626 Test #3274: mpi_dst_example_simple_lap_c_facto1_sched4_kway_pqrcpend ................***Timeout 695.41 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of 
MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3274: mpi_dst_example_simple_lap_c_facto1_sched4_kway_pqrcpend 3088/3626 Test #3275: mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_pqrcpbegin ...***Timeout 695.40 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational 
models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3275: mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_pqrcpbegin 3088/3626 Test #3276: mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_pqrcpend .....***Timeout 695.41 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 
Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3276: mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_pqrcpend 3088/3626 Test #3277: mpi_dst_example_simple_lap_c_facto1_sched4_not_rqrcpbegin ...............***Timeout 696.45 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.812264e+01 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.160114e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.102262e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.021060e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.261911e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.594881e+00 s Start 3277: mpi_dst_example_simple_lap_c_facto1_sched4_not_rqrcpbegin 3088/3626 Test #3278: mpi_dst_example_simple_lap_c_facto1_sched4_not_rqrcpend .................***Timeout 696.44 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels 
of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.622241e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.022837e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.492091e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Start 3278: mpi_dst_example_simple_lap_c_facto1_sched4_not_rqrcpend 3088/3626 Test #3279: mpi_dst_example_simple_lap_c_facto1_sched4_kway_rqrcpbegin ..............***Timeout 697.44 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: 
Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3279: mpi_dst_example_simple_lap_c_facto1_sched4_kway_rqrcpbegin 3088/3626 Test #3280: mpi_dst_example_simple_lap_c_facto1_sched4_kway_rqrcpend ................***Timeout 698.95 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set 
to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.293197e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.206558e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.190248e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.255566e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.471625e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 5.196203e+00 s Time to initialize coeftab 1.130485e+00 s Time to factorize 5.938083e+01 s (367.45 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Start 3280: mpi_dst_example_simple_lap_c_facto1_sched4_kway_rqrcpend 3088/3626 Test #3281: mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_rqrcpbegin ...***Timeout 698.95 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX 
package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.676074e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.213598e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.751420e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 4.594564e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.779206e+01 s 
+-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.084405e+00 s Start 3281: mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_rqrcpbegin 3088/3626 Test #3282: mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_rqrcpend .....***Timeout 700.23 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3282: mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_rqrcpend 3088/3626 Test #3283: mpi_dst_example_simple_lap_c_facto1_sched4_not_tqrcpbegin ...............***Timeout 700.73 sec ischedInit: The thread 
number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.257889e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.242257e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.531967e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 
MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 9.360679e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.937002e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Start 3283: mpi_dst_example_simple_lap_c_facto1_sched4_not_tqrcpbegin 3088/3626 Test #3284: mpi_dst_example_simple_lap_c_facto1_sched4_not_tqrcpend .................***Timeout 701.95 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3284: mpi_dst_example_simple_lap_c_facto1_sched4_not_tqrcpend 
3088/3626 Test #3285: mpi_dst_example_simple_lap_c_facto1_sched4_kway_tqrcpbegin ..............***Timeout 702.44 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.948516e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.414495e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.014695e-01 s 
+-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.180158e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.578708e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.405432e+00 s Time to initialize coeftab 5.791890e+00 s Start 3285: mpi_dst_example_simple_lap_c_facto1_sched4_kway_tqrcpbegin 3088/3626 Test #3286: mpi_dst_example_simple_lap_c_facto1_sched4_kway_tqrcpend ................***Timeout 702.83 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) ischedInit: The thread number has been automatically set to 256 Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: 
Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.278830e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.800023e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.462020e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.382290e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.981947e+01 s Start 3286: mpi_dst_example_simple_lap_c_facto1_sched4_kway_tqrcpend 3088/3626 Test #3287: mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_tqrcpbegin ...***Timeout 703.43 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal 
Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.744884e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.907530e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.232277e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 4.263034e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.035840e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.711023e+00 s Time to initialize coeftab 3.585086e+00 s Start 3287: mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_tqrcpbegin 3088/3626 Test #3288: mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_tqrcpend .....***Timeout 703.43 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package 
+ +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.855738e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.111832e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.313055e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.075160e+01 s 
+-------------------------------------------------+ Analyze task: Total time for analyze 3.312786e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.334034e+00 s Time to initialize coeftab 1.092457e+00 s Time to factorize 8.074803e+01 s (270.22 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Start 3288: mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_tqrcpend 3088/3626 Test #3289: mpi_dst_example_simple_lap_c_facto1_sched4_not_rqrrtbegin ...............***Timeout 703.43 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian 
Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.280826e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.545194e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.703913e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.099155e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.872770e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Start 3289: mpi_dst_example_simple_lap_c_facto1_sched4_not_rqrrtbegin 3088/3626 Test #3290: mpi_dst_example_simple_lap_c_facto1_sched4_not_rqrrtend .................***Timeout 704.04 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models 
CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.762542e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.598665e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.653420e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.150918e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.940563e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.161242e+00 s Time to initialize coeftab 6.608372e-01 s Time to factorize 9.015395e+01 s (242.02 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko 
Inside not selected 0 o / 0 o Start 3290: mpi_dst_example_simple_lap_c_facto1_sched4_not_rqrrtend 3088/3626 Test #3291: mpi_dst_example_simple_lap_c_facto1_sched4_kway_rqrrtbegin ..............***Timeout 704.04 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.769032e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.293513e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 
Stoping criterion -1 Time for reordering 3.080786e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 4.807259e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.941857e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.567367e+00 s Time to initialize coeftab 7.056104e+00 s Start 3291: mpi_dst_example_simple_lap_c_facto1_sched4_kway_rqrrtbegin 3088/3626 Test #3292: mpi_dst_example_simple_lap_c_facto1_sched4_kway_rqrrtend ................***Timeout 703.94 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been 
automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.452741e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.536924e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.059473e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.107512e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.936115e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.301661e+00 s Time to initialize coeftab 4.690230e-01 s Start 3292: mpi_dst_example_simple_lap_c_facto1_sched4_kway_rqrrtend 3088/3626 Test #3293: mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_rqrrtbegin ...***Timeout 704.11 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational 
models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3293: mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_rqrrtbegin 3088/3626 Test #3294: mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_rqrrtend .....***Timeout 704.07 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress 
minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.628257e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.569015e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.115926e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.309442e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.559288e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 5.892798e+00 s Time to initialize coeftab 4.405774e+00 s Time to factorize 4.285673e+01 s (509.12 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Start 3294: 
mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_rqrrtend 3088/3626 Test #3295: mpi_dst_example_simple_lap_c_facto1_sched4_kway_pqrcpilu0 ...............***Timeout 704.75 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.124565e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.538085e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for 
reordering 1.795044e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.222215e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.914261e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.360135e+00 s Time to initialize coeftab 1.134963e+00 s Time to factorize 9.799750e+01 s (222.65 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Start 3295: mpi_dst_example_simple_lap_c_facto1_sched4_kway_pqrcpilu0 3088/3626 Test #3296: mpi_dst_example_simple_lap_c_facto1_sched4_kway_pqrcpilu1 ...............***Timeout 704.69 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 
Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.021989e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.482735e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.323262e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 8.648486e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.550600e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.387529e-01 s Time to initialize coeftab 2.187891e-01 s Start 3296: mpi_dst_example_simple_lap_c_facto1_sched4_kway_pqrcpilu1 3088/3626 Test #3297: mpi_dst_example_simple_lap_c_facto2_sched4_not_svdbegin .................***Timeout 704.65 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.731204e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.256358e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.119060e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 5.514908e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.739899e+01 s 
+-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.985895e+00 s Time to initialize coeftab 4.716820e+00 s Start 3297: mpi_dst_example_simple_lap_c_facto2_sched4_not_svdbegin 3088/3626 Test #3299: mpi_dst_example_simple_lap_c_facto2_sched4_kway_svdbegin ................***Timeout 705.06 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.828478e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in 
of L 17.592432 Time to compute symbol matrix 8.282091e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.457913e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 9.634959e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.159189e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 5.776827e+00 s Time to initialize coeftab 9.686404e+00 s Start 3299: mpi_dst_example_simple_lap_c_facto2_sched4_kway_svdbegin 3088/3626 Test #3300: mpi_dst_example_simple_lap_c_facto2_sched4_kway_svdend ..................***Timeout 705.53 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections 
depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.081457e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.310328e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.041501e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 5.386171e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.390300e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 5.590502e+00 s Time to initialize coeftab 1.293002e+00 s Start 3300: mpi_dst_example_simple_lap_c_facto2_sched4_kway_svdend 3088/3626 Test #3301: mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_svdbegin .....***Timeout 706.02 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI 
processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.647981e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.450759e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.235260e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 4.994800e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.819644e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.428664e+00 s Time to initialize coeftab 
5.279490e+00 s Start 3301: mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_svdbegin 3088/3626 Test #3302: mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_svdend .......***Timeout 706.02 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3302: mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_svdend 3088/3626 Test #3303: mpi_dst_example_simple_lap_c_facto2_sched4_not_pqrcpbegin ...............***Timeout 706.49 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3303: mpi_dst_example_simple_lap_c_facto2_sched4_not_pqrcpbegin 3088/3626 Test #3305: mpi_dst_example_simple_lap_c_facto2_sched4_kway_pqrcpbegin ..............***Timeout 706.96 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 
MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.100316e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.067644e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.757730e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.015945e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.203782e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.395744e+00 s Time to initialize coeftab 3.524110e+00 s Start 3305: mpi_dst_example_simple_lap_c_facto2_sched4_kway_pqrcpbegin 3088/3626 Test #3325: 
mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_rqrrtbegin ...***Timeout 707.56 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.103321e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.811882e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.732370e-01 s +-------------------------------------------------+ 
Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 7.221861e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.944786e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.777169e+00 s Time to initialize coeftab 1.696509e+01 s Time to factorize 1.563761e+02 s (261.74 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Start 3325: mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_rqrrtbegin 3088/3626 Test #3326: mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_rqrrtend .....***Timeout 707.55 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 
Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3326: mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_rqrrtend 3088/3626 Test #3329: mpi_dst_example_simple_lap_c_facto3_sched4_not_svdbegin .................***Timeout 707.43 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 
+-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.240201e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.108970e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.629304e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.368464e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.391611e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 1.198967e+00 s Time to initialize coeftab 3.040177e+00 s Start 3329: mpi_dst_example_simple_lap_c_facto3_sched4_not_svdbegin 3088/3626 Test #3331: mpi_dst_example_simple_lap_c_facto3_sched4_kway_svdbegin ................***Timeout 708.60 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: 
Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.532123e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.514747e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.015148e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.121467e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.879778e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 1.235244e+00 s Time to initialize coeftab 2.264751e+00 s Time to factorize 2.046741e+02 s (101.47 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o 
Start 3331: mpi_dst_example_simple_lap_c_facto3_sched4_kway_svdbegin 3088/3626 Test #3332: mpi_dst_example_simple_lap_c_facto3_sched4_kway_svdend ..................***Timeout 708.62 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch 1: 300 1140 2: 200 760 3: 200 660 Time to compute ordering 2.144548e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.546248e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for 
reordering 3.609800e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.395740e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.737890e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 5.146612e+00 s Time to initialize coeftab 8.861069e-01 s Start 3332: mpi_dst_example_simple_lap_c_facto3_sched4_kway_svdend 3088/3626 Test #3333: mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_svdbegin .....***Timeout 710.45 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 
Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.558456e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.107821e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.019943e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.144642e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.862722e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 6.565173e-01 s Time to initialize coeftab 1.230515e+00 s Start 3333: mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_svdbegin 3088/3626 Test #3343: mpi_dst_example_simple_lap_c_facto3_sched4_kway_rqrcpbegin ..............***Timeout 708.21 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 3343: mpi_dst_example_simple_lap_c_facto3_sched4_kway_rqrcpbegin 3088/3626 Test #3345: mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_rqrcpbegin ...***Timeout 709.33 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.538055e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.466823e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.137957e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 6.238049e+00 s 
+-------------------------------------------------+ Analyze task: Total time for analyze 3.571567e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 1.914470e+00 s Start 3345: mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_rqrcpbegin 3088/3626 Test #3349: mpi_dst_example_simple_lap_c_facto3_sched4_kway_tqrcpbegin ..............***Timeout 709.59 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.517077e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol 
factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.115905e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.361877e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 5.793907e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.330462e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 4.569539e+00 s Time to initialize coeftab 6.939931e+00 s Start 3349: mpi_dst_example_simple_lap_c_facto3_sched4_kway_tqrcpbegin 3088/3626 Test #3351: mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_tqrcpbegin ...***Timeout 710.25 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per 
block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.217013e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.382231e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.775720e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.165473e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.556470e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 4.001299e+00 s Time to initialize coeftab 7.041318e+00 s Start 3351: mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_tqrcpbegin 3088/3626 Test #3361: mpi_dst_example_simple_lap_c_facto4_sched4_not_svdbegin .................***Timeout 707.06 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: 
sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.955033e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.552868e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.621924e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.298804e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.196209e+01 s +-------------------------------------------------+ 
Factorization task: Factorization used: LDL^h Time to initialize internal csc 9.990551e-01 s Time to initialize coeftab 3.124285e+00 s Start 3361: mpi_dst_example_simple_lap_c_facto4_sched4_not_svdbegin 3088/3626 Test #3362: mpi_dst_example_simple_lap_c_facto4_sched4_not_svdend ...................***Timeout 705.41 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.612135e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 
4.848701e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.218533e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.163323e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.981515e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 1.296065e-01 s Time to initialize coeftab 1.695395e-01 s Start 3362: mpi_dst_example_simple_lap_c_facto4_sched4_not_svdend 3088/3626 Test #3363: mpi_dst_example_simple_lap_c_facto4_sched4_kway_svdbegin ................***Timeout 705.33 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections 
distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.191400e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.954355e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.600921e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.101933e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.444764e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 3.929259e+00 s Time to initialize coeftab 6.178047e+00 s Start 3363: mpi_dst_example_simple_lap_c_facto4_sched4_kway_svdbegin 3088/3626 Test #3365: mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_svdbegin .....***Timeout 702.46 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per 
process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.297384e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.459142e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.785723e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.172766e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.855397e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 4.046889e+00 s Time to initialize coeftab 6.965632e+00 s Start 3365: 
mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_svdbegin 3088/3626 Test #3367: mpi_dst_example_simple_lap_c_facto4_sched4_not_pqrcpbegin ...............***Timeout 703.12 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.981599e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.782259e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for 
reordering 2.631033e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.795722e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.692008e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 1.783514e+00 s Time to initialize coeftab 1.844132e+00 s Time to factorize 1.096857e+02 s (198.93 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Start 3367: mpi_dst_example_simple_lap_c_facto4_sched4_not_pqrcpbegin 3088/3626 Test #3369: mpi_dst_example_simple_lap_c_facto4_sched4_kway_pqrcpbegin ..............***Timeout 703.72 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 
Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.302357e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.793909e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.861446e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.764450e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.782668e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 1.401100e-01 s Time to initialize coeftab 2.341107e+00 s Time to factorize 1.149940e+02 s (189.74 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 8.844730e+00 s - iteration 1 : total iteration time 
7.22 s error 5.9158e-11 Time for refinement 1.020494e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.264065e-08 max(|| b_i - A x_i ||_1) 3.162116e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.979118e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.264065e-08 max(|| b_i - A x_i ||_1) 3.162116e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.979118e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.264065e-08 max(|| b_i - A x_i ||_1) 3.162116e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.979118e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.264065e-08 max(|| b_i - A x_i ||_1) 3.162116e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.979118e-01 (SUCCESS) Start 3369: mpi_dst_example_simple_lap_c_facto4_sched4_kway_pqrcpbegin 3088/3626 Test #3371: mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_pqrcpbegin ...***Timeout 704.82 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 
Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.143568e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.534637e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.219906e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 4.572889e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.671358e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 5.694800e-01 s Time to initialize coeftab 4.817349e+00 s Start 3371: mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_pqrcpbegin 3088/3626 Test #3373: mpi_dst_example_simple_lap_c_facto4_sched4_not_rqrcpbegin ...............***Timeout 705.56 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread 
number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.090755e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.420437e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.960487e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 
6.932254e-04 s Time for mapping/scheduling 2.575770e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.421485e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 1.235056e+00 s Time to initialize coeftab 3.210888e+00 s Start 3373: mpi_dst_example_simple_lap_c_facto4_sched4_not_rqrcpbegin 3088/3626 Test #3374: mpi_dst_example_simple_lap_c_facto4_sched4_not_rqrcpend .................***Timeout 707.04 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.444713e+01 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.382161e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.976018e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.775183e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.396588e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 1.637291e+00 s Time to initialize coeftab 3.301179e+00 s Start 3374: mpi_dst_example_simple_lap_c_facto4_sched4_not_rqrcpend 3088/3626 Test #3375: mpi_dst_example_simple_lap_c_facto4_sched4_kway_rqrcpbegin ..............***Timeout 707.94 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 
1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.337826e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.882971e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.618918e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.802194e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.877889e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 2.122967e+00 s Time to initialize coeftab 5.675952e+00 s Start 3375: mpi_dst_example_simple_lap_c_facto4_sched4_kway_rqrcpbegin 3088/3626 Test #3377: mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_rqrcpbegin ...***Timeout 707.08 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.381216e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.150059e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.610581e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 4.553479e+00 s +-------------------------------------------------+ Analyze task: Total 
time for analyze 1.908099e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 2.522975e+00 s Time to initialize coeftab 6.298661e+00 s Start 3377: mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_rqrcpbegin 3088/3626 Test #3379: mpi_dst_example_simple_lap_c_facto4_sched4_not_tqrcpbegin ...............***Timeout 706.11 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.649433e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax 
Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.446974e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.134097e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.478368e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.612863e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 6.609973e-01 s Time to initialize coeftab 3.379717e+00 s Time to factorize 1.333613e+02 s (163.61 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Start 3379: mpi_dst_example_simple_lap_c_facto4_sched4_not_tqrcpbegin 3088/3626 Test #3381: mpi_dst_example_simple_lap_c_facto4_sched4_kway_tqrcpbegin ..............***Timeout 706.16 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 
GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.626837e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.155456e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.486873e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.617286e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.062836e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 1.434669e+00 s Time to initialize coeftab 3.087781e+00 s Start 3381: mpi_dst_example_simple_lap_c_facto4_sched4_kway_tqrcpbegin 3088/3626 Test #3383: mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_tqrcpbegin ...***Timeout 705.49 sec ischedInit: The thread number has been automatically set 
to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.714990e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.957025e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.464157e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: 
Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.951001e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.181219e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 1.686381e+00 s Time to initialize coeftab 6.663279e+00 s Time to factorize 1.217267e+02 s (179.25 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Start 3383: mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_tqrcpbegin 3088/3626 Test #3385: mpi_dst_example_simple_lap_c_facto4_sched4_not_rqrrtbegin ...............***Timeout 706.45 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 
ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.018119e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.248991e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.512870e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.770426e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.878430e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 2.007361e+00 s Time to initialize coeftab 6.185174e+00 s Start 3385: mpi_dst_example_simple_lap_c_facto4_sched4_not_rqrrtbegin 3088/3626 Test #3387: mpi_dst_example_simple_lap_c_facto4_sched4_kway_rqrrtbegin ..............***Timeout 707.24 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.624370e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.809431e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.764306e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.705004e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.058596e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to 
initialize internal csc 9.563811e-01 s Time to initialize coeftab 3.740506e+00 s Time to factorize 1.357904e+02 s (160.68 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Start 3387: mpi_dst_example_simple_lap_c_facto4_sched4_kway_rqrrtbegin 3088/3626 Test #3389: mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_rqrrtbegin ...***Timeout 707.65 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 
200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.429748e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.726987e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.668483e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.301411e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.003481e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 1.343935e+00 s Time to initialize coeftab 6.092004e+00 s Start 3389: mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_rqrrtbegin 3088/3626 Test #3392: mpi_dst_example_simple_lap_c_facto4_sched4_kway_pqrcpilu1 ...............***Timeout 706.20 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - 
CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.756316e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.341629e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.972859e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.495124e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.530429e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 1.292128e+00 s Time to initialize coeftab 6.919665e-01 s Time to factorize 1.090620e+02 s (200.06 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 
Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 6.014690e+00 s Start 3392: mpi_dst_example_simple_lap_c_facto4_sched4_kway_pqrcpilu1 3088/3626 Test #3393: mpi_dst_example_simple_lap_z_facto0_sched4_not_svdbegin .................***Timeout 707.06 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.085036e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 
Fill-in of L 17.592432 Time to compute symbol matrix 6.537598e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.166431e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 7.857339e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.115309e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.681260e+00 s Time to initialize coeftab 4.275701e+00 s Start 3393: mpi_dst_example_simple_lap_z_facto0_sched4_not_svdbegin 3088/3626 Test #3394: mpi_dst_example_simple_lap_z_facto0_sched4_not_svdend ...................***Timeout 707.02 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels 
of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.795705e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.299724e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.477361e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 4.361399e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.486463e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.245085e+00 s Time to initialize coeftab 1.127837e+00 s Start 3394: mpi_dst_example_simple_lap_z_facto0_sched4_not_svdend 3088/3626 Test #3395: mpi_dst_example_simple_lap_z_facto0_sched4_kway_svdbegin ................***Timeout 707.21 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ 
Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.345594e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.729729e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.546708e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 5.408358e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.931174e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 8.096307e-01 s Time to initialize 
coeftab 4.044596e+00 s Start 3395: mpi_dst_example_simple_lap_z_facto0_sched4_kway_svdbegin 3088/3626 Test #3396: mpi_dst_example_simple_lap_z_facto0_sched4_kway_svdend ..................***Timeout 707.98 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.383290e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.689032e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping 
criterion -1 Time for reordering 1.109768e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 6.834640e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.116196e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.960648e+00 s Time to initialize coeftab 5.506473e-01 s Start 3396: mpi_dst_example_simple_lap_z_facto0_sched4_kway_svdend 3088/3626 Test #3397: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_svdbegin .....***Timeout 708.94 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been 
automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 3: 200 660 2: 200 760 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.244333e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.244285e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.706583e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 6.869048e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.061011e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.746071e+00 s Time to initialize coeftab 4.831810e+00 s Start 3397: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_svdbegin 3088/3626 Test #3399: mpi_dst_example_simple_lap_z_facto0_sched4_not_pqrcpbegin ...............***Timeout 710.68 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of 
MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 2: 200 760 1: 300 1140 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.361079e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.109301e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.698534e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.649200e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.722932e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 8.411353e-01 s Time to initialize coeftab 9.089866e-01 s Start 3399: mpi_dst_example_simple_lap_z_facto0_sched4_not_pqrcpbegin 3088/3626 Test #3401: 
mpi_dst_example_simple_lap_z_facto0_sched4_kway_pqrcpbegin ..............***Timeout 710.61 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.866227e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.550909e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.527116e-01 s +-------------------------------------------------+ Mapping/Scheduling 
subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 7.147075e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.614750e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.870252e+00 s Time to initialize coeftab 1.662532e+00 s Start 3401: mpi_dst_example_simple_lap_z_facto0_sched4_kway_pqrcpbegin 3088/3626 Test #3403: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_pqrcpbegin ...***Timeout 708.93 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 
300 1140 2: 200 760 1: 300 1140 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.973347e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.934009e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.053730e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 4.521136e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.633448e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.092741e+00 s Time to initialize coeftab 1.580566e+00 s Time to factorize 1.116257e+02 s (186.05 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Start 3403: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_pqrcpbegin 3088/3626 Test #3405: mpi_dst_example_simple_lap_z_facto0_sched4_not_rqrcpbegin ...............***Timeout 708.14 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started 
PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.346223e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.970689e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.127566e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.618417e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.673433e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 
1.075170e+00 s Time to initialize coeftab 3.319997e+00 s Start 3405: mpi_dst_example_simple_lap_z_facto0_sched4_not_rqrcpbegin 3088/3626 Test #3407: mpi_dst_example_simple_lap_z_facto0_sched4_kway_rqrcpbegin ..............***Timeout 706.29 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.001457e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.746289e-02 s +-------------------------------------------------+ Reordering 
subtask: Split level 0 Stoping criterion -1 Time for reordering 2.097818e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.796075e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.447766e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.062704e-01 s Time to initialize coeftab 5.741875e+00 s Start 3407: mpi_dst_example_simple_lap_z_facto0_sched4_kway_rqrcpbegin 3088/3626 Test #3409: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_rqrcpbegin ...***Timeout 704.35 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 
ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.539411e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.428320e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.853253e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 9.663678e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.673620e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.516289e+00 s Time to initialize coeftab 6.541070e+00 s Start 3409: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_rqrcpbegin 3088/3626 Test #3411: mpi_dst_example_simple_lap_z_facto0_sched4_not_tqrcpbegin ...............***Timeout 696.03 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) 
Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3411: mpi_dst_example_simple_lap_z_facto0_sched4_not_tqrcpbegin 3088/3626 Test #3412: mpi_dst_example_simple_lap_z_facto0_sched4_not_tqrcpend .................***Timeout 689.97 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used 
Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.165083e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.510922e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.783528e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 6.675348e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.364629e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Start 3412: mpi_dst_example_simple_lap_z_facto0_sched4_not_tqrcpend 3088/3626 Test #3413: mpi_dst_example_simple_lap_z_facto0_sched4_kway_tqrcpbegin ..............***Timeout 688.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 
Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.023678e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.645497e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.881141e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 6.946943e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.766753e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize 
internal csc 2.618473e+00 s Time to initialize coeftab 7.086666e+00 s Time to factorize 1.043678e+02 s (198.99 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Start 3413: mpi_dst_example_simple_lap_z_facto0_sched4_kway_tqrcpbegin Test #3093: mpi_dst_example_simple_lap_s_facto1_sched4_kway_tqrcpbegin ..............***Timeout 648.54 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 
+-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.169438e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.760597e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.735434e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.450148e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.715173e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.067321e+00 s Time to initialize coeftab 2.650340e+00 s Time to factorize 4.085349e+01 s (131.18 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44 Ko / 44.3 Ko ------------------------------------------------ Total 68.2 Ko / 68.5 Ko Test #3131: mpi_dst_example_simple_lap_s_facto2_sched4_kway_rqrrtbegin ..............***Timeout 637.96 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: 
Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #3210: mpi_dst_example_simple_lap_d_facto2_sched4_kway_pqrcpend ................***Timeout 612.57 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS 
Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3210: mpi_dst_example_simple_lap_d_facto2_sched4_kway_pqrcpend Test #3217: mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_rqrcpbegin ...***Timeout 609.36 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 
nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3217: mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_rqrcpbegin Test #3252: mpi_dst_example_simple_lap_c_facto0_sched4_not_tqrcpend .................***Timeout 382.90 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3252: mpi_dst_example_simple_lap_c_facto0_sched4_not_tqrcpend 3090/3626 Test #3414: mpi_dst_example_simple_lap_z_facto0_sched4_kway_tqrcpend ................***Timeout 379.16 sec Start 3414: 
mpi_dst_example_simple_lap_z_facto0_sched4_kway_tqrcpend 3090/3626 Test #3415: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_tqrcpbegin ...***Timeout 380.17 sec Start 3415: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_tqrcpbegin 3090/3626 Test #3416: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_tqrcpend .....***Timeout 380.85 sec Start 3416: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_tqrcpend 3090/3626 Test #3417: mpi_dst_example_simple_lap_z_facto0_sched4_not_rqrrtbegin ...............***Timeout 381.86 sec Start 3417: mpi_dst_example_simple_lap_z_facto0_sched4_not_rqrrtbegin 3090/3626 Test #3418: mpi_dst_example_simple_lap_z_facto0_sched4_not_rqrrtend .................***Timeout 382.76 sec Start 3418: mpi_dst_example_simple_lap_z_facto0_sched4_not_rqrrtend 3090/3626 Test #3419: mpi_dst_example_simple_lap_z_facto0_sched4_kway_rqrrtbegin ..............***Timeout 383.06 sec Start 3419: mpi_dst_example_simple_lap_z_facto0_sched4_kway_rqrrtbegin 3090/3626 Test #3420: mpi_dst_example_simple_lap_z_facto0_sched4_kway_rqrrtend ................***Timeout 383.75 sec Start 3420: mpi_dst_example_simple_lap_z_facto0_sched4_kway_rqrrtend 3090/3626 Test #3421: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_rqrrtbegin ...***Timeout 384.16 sec Start 3421: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_rqrrtbegin 3090/3626 Test #3422: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_rqrrtend .....***Timeout 384.48 sec Start 3422: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_rqrrtend 3090/3626 Test #3423: mpi_dst_example_simple_lap_z_facto0_sched4_kway_pqrcpilu0 ...............***Timeout 385.31 sec Start 3423: mpi_dst_example_simple_lap_z_facto0_sched4_kway_pqrcpilu0 3090/3626 Test #3424: mpi_dst_example_simple_lap_z_facto0_sched4_kway_pqrcpilu1 ...............***Timeout 385.72 sec Start 3424: mpi_dst_example_simple_lap_z_facto0_sched4_kway_pqrcpilu1 3090/3626 Test #3425: 
mpi_dst_example_simple_lap_z_facto1_sched4_not_svdbegin .................***Timeout 385.99 sec Start 3425: mpi_dst_example_simple_lap_z_facto1_sched4_not_svdbegin 3090/3626 Test #3426: mpi_dst_example_simple_lap_z_facto1_sched4_not_svdend ...................***Timeout 386.50 sec Start 3426: mpi_dst_example_simple_lap_z_facto1_sched4_not_svdend 3090/3626 Test #3427: mpi_dst_example_simple_lap_z_facto1_sched4_kway_svdbegin ................***Timeout 387.45 sec Start 3427: mpi_dst_example_simple_lap_z_facto1_sched4_kway_svdbegin 3090/3626 Test #3428: mpi_dst_example_simple_lap_z_facto1_sched4_kway_svdend ..................***Timeout 388.45 sec Start 3428: mpi_dst_example_simple_lap_z_facto1_sched4_kway_svdend 3090/3626 Test #3429: mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_svdbegin .....***Timeout 389.15 sec Start 3429: mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_svdbegin 3090/3626 Test #3430: mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_svdend .......***Timeout 390.13 sec Start 3430: mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_svdend 3090/3626 Test #3431: mpi_dst_example_simple_lap_z_facto1_sched4_not_pqrcpbegin ...............***Timeout 391.10 sec Start 3431: mpi_dst_example_simple_lap_z_facto1_sched4_not_pqrcpbegin 3090/3626 Test #3432: mpi_dst_example_simple_lap_z_facto1_sched4_not_pqrcpend .................***Timeout 391.59 sec Start 3432: mpi_dst_example_simple_lap_z_facto1_sched4_not_pqrcpend 3090/3626 Test #3433: mpi_dst_example_simple_lap_z_facto1_sched4_kway_pqrcpbegin ..............***Timeout 391.58 sec Start 3433: mpi_dst_example_simple_lap_z_facto1_sched4_kway_pqrcpbegin 3090/3626 Test #3434: mpi_dst_example_simple_lap_z_facto1_sched4_kway_pqrcpend ................***Timeout 391.42 sec Start 3434: mpi_dst_example_simple_lap_z_facto1_sched4_kway_pqrcpend 3090/3626 Test #3435: mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_pqrcpbegin ...***Timeout 391.43 sec Start 3435: 
mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_pqrcpbegin 3090/3626 Test #3436: mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_pqrcpend .....***Timeout 391.45 sec Start 3436: mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_pqrcpend 3090/3626 Test #3437: mpi_dst_example_simple_lap_z_facto1_sched4_not_rqrcpbegin ...............***Timeout 391.72 sec Start 3437: mpi_dst_example_simple_lap_z_facto1_sched4_not_rqrcpbegin 3090/3626 Test #3438: mpi_dst_example_simple_lap_z_facto1_sched4_not_rqrcpend .................***Timeout 391.44 sec Start 3438: mpi_dst_example_simple_lap_z_facto1_sched4_not_rqrcpend 3090/3626 Test #3439: mpi_dst_example_simple_lap_z_facto1_sched4_kway_rqrcpbegin ..............***Timeout 391.29 sec Start 3439: mpi_dst_example_simple_lap_z_facto1_sched4_kway_rqrcpbegin 3090/3626 Test #3440: mpi_dst_example_simple_lap_z_facto1_sched4_kway_rqrcpend ................***Timeout 391.53 sec Start 3440: mpi_dst_example_simple_lap_z_facto1_sched4_kway_rqrcpend 3090/3626 Test #3441: mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_rqrcpbegin ...***Timeout 392.03 sec Start 3441: mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_rqrcpbegin Test #3065: mpi_dst_example_simple_lap_s_facto0_sched4_not_rqrrtbegin ...............***Timeout 392.73 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - 
CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.862885e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.271794e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.056581e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.112779e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.037031e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.496346e+00 s Time to initialize coeftab 4.191447e+00 s Time to factorize 5.716508e+01 s (90.68 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko 
Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 8.886957e+00 s - iteration 1 : total iteration time 41.9 s error 7.2081e-11 Time for refinement 5.775775e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.915758e-08 max(|| b_i - A x_i ||_1) 2.917251e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.665789e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.915758e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.915758e-08 max(|| b_i - A x_i ||_1) 2.917251e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.665789e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.917251e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.665789e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.915758e-08 max(|| b_i - A x_i ||_1) 2.917251e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.665789e-01 (SUCCESS) Test #3067: mpi_dst_example_simple_lap_s_facto0_sched4_kway_rqrrtbegin ..............***Timeout 392.72 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 
Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.500834e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.967858e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.336496e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 8.589048e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.917773e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.849602e+00 s Time to initialize coeftab 3.319440e+00 s Time to factorize 5.066108e+01 s (102.32 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ 
Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 2.595014e+01 s - iteration 1 : total iteration time 19.6 s error 5.1253e-11 Time for refinement 3.361282e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.090514e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.090514e-08 max(|| b_i - A x_i ||_1) 2.967579e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.729031e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.967579e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.729031e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.090514e-08 max(|| b_i - A x_i ||_1) 2.967579e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.729031e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.090514e-08 max(|| b_i - A x_i ||_1) 2.967579e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.729031e-01 (SUCCESS) Test #3068: mpi_dst_example_simple_lap_s_facto0_sched4_kway_rqrrtend ................***Timeout 392.47 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: 
PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.710225e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.533828e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.164629e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 7.814933e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.570807e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.642610e+00 s Time to initialize coeftab 6.715895e-01 s Time to factorize 4.330225e+01 s (119.71 KFlop/s) Number of operations 25.89 MFlops Number of 
static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 4.833805e+00 s Time for refinement 2.386763e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.902192e-07 max(|| b_i - A x_i ||_1) 8.495661e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.067557e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.902192e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.902192e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.902192e-07 max(|| b_i - A x_i ||_1) 8.495661e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.067557e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.495661e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.067557e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.495661e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.067557e+00 (SUCCESS) Test #3069: mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_rqrrtbegin ...***Timeout 392.46 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: 
Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.276344e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.809779e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.364715e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.983450e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.457647e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.866385e+00 s Time to initialize coeftab 4.437052e+00 s Time to factorize 8.417679e+01 s (61.58 KFlop/s) Number 
of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 3.117488e+01 s - iteration 1 : total iteration time 38.7 s error 5.2567e-11 Time for refinement 5.503012e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.003701e-08 max(|| b_i - A x_i ||_1) 2.983182e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.748638e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.003701e-08 max(|| b_i - A x_i ||_1) 2.983182e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.748638e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.003701e-08 max(|| b_i - A x_i ||_1) 2.983182e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.748638e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.003701e-08 max(|| b_i - A x_i ||_1) 2.983182e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.748638e-01 (SUCCESS) Test #3070: mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_rqrrtend .....***Timeout 392.45 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: 
Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.260058e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.447376e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.895689e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.607306e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.808736e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 5.521220e+00 s Time 
to initialize coeftab 1.337023e+00 s Time to factorize 3.433605e+01 s (150.97 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.932966e+01 s Time for refinement 5.851995e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996107e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.757438e-07 max(|| b_i - A x_i ||_1) 9.819997e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.233971e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.757438e-07 max(|| b_i - A x_i ||_1) 9.819997e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.233971e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.757438e-07 max(|| b_i - A x_i ||_1) 9.819997e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.233971e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.757438e-07 max(|| b_i - A x_i ||_1) 9.819997e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.233971e+00 (SUCCESS) Test #3071: mpi_dst_example_simple_lap_s_facto0_sched4_kway_pqrcpilu0 ...............***Timeout 392.45 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.422797e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.602002e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.336885e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.666821e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.239814e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize 
internal csc 7.339863e-01 s Time to initialize coeftab 1.555117e+00 s Time to factorize 5.060802e+01 s (102.43 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.774194e+01 s - iteration 1 : total iteration time 11.1 s error 2.5436e-11 Time for refinement 3.143172e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.242116e-08 max(|| b_i - A x_i ||_1) 3.028273e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.805299e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.242116e-08 max(|| b_i - A x_i ||_1) 3.028273e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.805299e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.242116e-08 max(|| b_i - A x_i ||_1) 3.028273e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.805299e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.242116e-08 max(|| b_i - A x_i ||_1) 3.028273e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.805299e-01 (SUCCESS) Test #3072: mpi_dst_example_simple_lap_s_facto0_sched4_kway_pqrcpilu1 ...............***Timeout 392.21 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread 
static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.596757e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.188447e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.657908e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.462169e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.711360e+01 s 
+-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.984177e+00 s Time to initialize coeftab 3.696731e+00 s Time to factorize 6.505361e+01 s (79.68 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Test #3081: mpi_dst_example_simple_lap_s_facto1_sched4_kway_pqrcpbegin ..............***Timeout 414.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 
2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.094916e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.209105e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.977869e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 8.452389e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.354408e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.676881e+00 s Time to initialize coeftab 1.095880e+00 s Time to factorize 5.203433e+01 s (102.99 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Test #3082: mpi_dst_example_simple_lap_s_facto1_sched4_kway_pqrcpend ................***Timeout 414.37 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ ischedInit: The thread number has been automatically set to 256 + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread 
dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.068798e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.598691e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.810429e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.408439e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.708329e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize 
internal csc 1.827574e+00 s Time to initialize coeftab 4.892527e-01 s Time to factorize 2.748410e+01 s (194.99 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 8.377000e+00 s Time for refinement 1.461470e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.811922e-07 max(|| b_i - A x_i ||_1) 7.699352e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.674932e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.811922e-07 max(|| b_i - A x_i ||_1) 7.699352e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.674932e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.811922e-07 max(|| b_i - A x_i ||_1) 7.699352e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.674932e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.811922e-07 max(|| b_i - A x_i ||_1) 7.699352e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.674932e-01 (SUCCESS) Test #3083: mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_pqrcpbegin ...***Timeout 414.36 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.162148e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.801615e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.941218e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.813895e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.841598e+01 s 
+-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 6.945117e+00 s Time to initialize coeftab 2.620846e+00 s Time to factorize 9.754772e+01 s (54.94 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.050354e+01 s - iteration 1 : total iteration time 25.7 s error 6.1345e-11 Time for refinement 4.773096e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.766780e-08 max(|| b_i - A x_i ||_1) 2.811379e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.532752e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.766780e-08 max(|| b_i - A x_i ||_1) 2.811379e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.532752e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.766780e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.766780e-08 max(|| b_i - A x_i ||_1) 2.811379e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.532752e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.811379e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.532752e-01 (SUCCESS) Test #3088: mpi_dst_example_simple_lap_s_facto1_sched4_kway_rqrcpend ................***Timeout 414.35 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread 
number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.163328e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.889280e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.192972e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 9.948977e+00 s 
+-------------------------------------------------+ Analyze task: Total time for analyze 9.366892e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.357336e+00 s Time to initialize coeftab 9.707987e-01 s Time to factorize 5.953750e+01 s (90.01 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 5.569116e+00 s Time for refinement 2.291686e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.631186e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.631186e-07 max(|| b_i - A x_i ||_1) 1.078087e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.354713e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.078087e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.354713e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.631186e-07 max(|| b_i - A x_i ||_1) 1.078087e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.354713e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.631186e-07 max(|| b_i - A x_i ||_1) 1.078087e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.354713e+00 (SUCCESS) Test #3090: mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_rqrcpend .....***Timeout 414.05 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been 
automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.207391e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.120594e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.314615e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s 
Time for mapping/scheduling 1.060690e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.337157e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.201493e+00 s Time to initialize coeftab 9.557293e-01 s Time to factorize 5.151264e+01 s (104.03 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.654393e+01 s Time for refinement 7.480159e+00 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.729732e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.729732e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.729732e-07 max(|| b_i - A x_i ||_1) 9.028464e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.134508e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 9.028464e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.134508e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 9.028464e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.134508e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.729732e-07 max(|| b_i - A x_i ||_1) 9.028464e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.134508e+00 (SUCCESS) Test #3092: mpi_dst_example_simple_lap_s_facto1_sched4_not_tqrcpend .................***Timeout 414.03 sec ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.135140e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.251563e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.003655e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to 
factorize 6.932254e-04 s Time for mapping/scheduling 6.829976e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.347149e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.327944e+00 s Time to initialize coeftab 8.164814e-01 s Time to factorize 6.023875e+01 s (88.96 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Test #3094: mpi_dst_example_simple_lap_s_facto1_sched4_kway_tqrcpend ................***Timeout 414.01 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 
ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.044003e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.072281e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.271222e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.901395e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.554155e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 6.578769e+00 s Time to initialize coeftab 2.073870e+00 s Time to factorize 5.239410e+01 s (102.28 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.085817e+01 s Time for refinement 4.680103e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i 
||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.703065e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.703065e-07 max(|| b_i - A x_i ||_1) 8.834576e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.110144e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.834576e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.110144e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.703065e-07 max(|| b_i - A x_i ||_1) 8.834576e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.110144e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.703065e-07 max(|| b_i - A x_i ||_1) 8.834576e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.110144e+00 (SUCCESS) Test #3096: mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_tqrcpend .....***Timeout 414.00 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 
Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.290455e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.078390e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.377729e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.016214e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.005273e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 5.555274e+00 s Time to initialize coeftab 1.625088e+00 s Time to factorize 8.018849e+01 s (66.83 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.569086e+01 s Time for refinement 6.965197e+00 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 
4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.670541e-07 max(|| b_i - A x_i ||_1) 1.087052e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.365979e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.670541e-07 max(|| b_i - A x_i ||_1) 1.087052e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.365979e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.670541e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.670541e-07 max(|| b_i - A x_i ||_1) 1.087052e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.365979e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.087052e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.365979e+00 (SUCCESS) Test #3100: mpi_dst_example_simple_lap_s_facto1_sched4_kway_rqrrtend ................***Timeout 413.71 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 
Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.884700e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.793659e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.468823e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.421442e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.055043e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.334034e+00 s Time to initialize coeftab 6.922680e-01 s Time to factorize 3.499507e+01 s (153.14 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 2.862201e+01 s Time for refinement 1.081098e+01 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 
max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.773413e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.773413e-07 max(|| b_i - A x_i ||_1) 1.473875e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.852057e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.473875e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.852057e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.773413e-07 max(|| b_i - A x_i ||_1) 1.473875e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.852057e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.773413e-07 max(|| b_i - A x_i ||_1) 1.473875e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.852057e+00 (SUCCESS) Test #3102: mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_rqrrtend .....***Timeout 413.70 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 
Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.468766e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.104892e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.330534e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.500926e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.724551e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 7.960902e+00 s Time to initialize coeftab 1.906999e+00 s Time to factorize 4.586905e+01 s (116.83 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.664585e+01 s Time for refinement 3.743273e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i 
||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.163583e-07 max(|| b_i - A x_i ||_1) 8.903838e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.118848e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.163583e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.163583e-07 max(|| b_i - A x_i ||_1) 8.903838e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.118848e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.903838e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.118848e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.163583e-07 max(|| b_i - A x_i ||_1) 8.903838e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.118848e+00 (SUCCESS) Test #3104: mpi_dst_example_simple_lap_s_facto1_sched4_kway_pqrcpilu1 ...............***Timeout 420.18 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting 
Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.055587e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.195887e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.307113e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.520731e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.758241e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.284518e+00 s Time to initialize coeftab 1.022078e+00 s Time to factorize 9.359187e+01 s (57.26 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Test #3105: mpi_dst_example_simple_lap_s_facto2_sched4_not_svdbegin .................***Timeout 419.83 sec ischedInit: The thread number has been automatically set 
to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.413831e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.794787e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.341172e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time 
to factorize 1.548061e-03 s Time for mapping/scheduling 1.112342e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.587377e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.595437e+00 s Time to initialize coeftab 5.195042e+00 s Time to factorize 8.502016e+01 s (120.26 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.4 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 3.789154e+01 s Time for refinement 1.198171e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.941568e-07 max(|| b_i - A x_i ||_1) 8.378327e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.052813e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.941568e-07 max(|| b_i - A x_i ||_1) 8.378327e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.052813e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.941568e-07 max(|| b_i - A x_i ||_1) 8.378327e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.052813e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.941568e-07 max(|| b_i - A x_i ||_1) 8.378327e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.052813e+00 (SUCCESS) Test #3107: mpi_dst_example_simple_lap_s_facto2_sched4_kway_svdbegin ................***Timeout 419.82 sec ischedInit: The thread number has been 
automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.307966e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.122743e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.220180e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 
6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 6.255469e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.208102e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.751042e+00 s Time to initialize coeftab 4.348252e+00 s Time to factorize 8.915074e+01 s (114.68 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.4 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 1.014738e+01 s Time for refinement 6.329078e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.953989e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.953989e-07 max(|| b_i - A x_i ||_1) 8.478503e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.065401e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.478503e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.065401e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.953989e-07 max(|| b_i - A x_i ||_1) 8.478503e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.065401e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.953989e-07 max(|| b_i - A x_i ||_1) 8.478503e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.065401e+00 (SUCCESS) Test #3108: mpi_dst_example_simple_lap_s_facto2_sched4_kway_svdend ..................***Timeout 419.81 sec ischedInit: The thread number 
has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.181501e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.064533e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.853109e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: 
Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.172321e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.777758e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 4.629844e+00 s Time to initialize coeftab 1.217057e+00 s Time to factorize 4.105899e+01 s (249.01 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 7.387586e+00 s Time for refinement 6.844278e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.702243e-07 max(|| b_i - A x_i ||_1) 7.436456e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.344579e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.702243e-07 max(|| b_i - A x_i ||_1) 7.436456e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.344579e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.702243e-07 max(|| b_i - A x_i ||_1) 7.436456e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.344579e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.702243e-07 max(|| b_i - A x_i ||_1) 7.436456e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.344579e-01 (SUCCESS) Test #3112: mpi_dst_example_simple_lap_s_facto2_sched4_not_pqrcpend .................***Timeout 419.80 sec ischedInit: The 
thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.306622e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.556817e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.100948e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 
MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.418226e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.158328e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.765247e+00 s Time to initialize coeftab 1.149574e+00 s Time to factorize 3.819520e+01 s (267.68 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 2.095769e+01 s - iteration 1 : total iteration time 16 s error 1.5256e-12 Time for refinement 2.695338e+01 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.802468e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.802468e-08 max(|| b_i - A x_i ||_1) 2.794928e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.512079e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.802468e-08 max(|| b_i - A x_i ||_1) 2.794928e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.512079e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.802468e-08 max(|| b_i - A x_i ||_1) 2.794928e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.512079e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.794928e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.512079e-01 (SUCCESS) Test #3115: 
mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_pqrcpbegin ...***Timeout 419.78 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.816426e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.292182e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.359538e-01 s +-------------------------------------------------+ 
Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 6.926204e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.634539e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 4.729608e-02 s Time to initialize coeftab 2.375852e+00 s Time to factorize 2.949056e+01 s (346.69 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 6.709322e+00 s - iteration 1 : total iteration time 42.5 s error 9.0209e-11 Time for refinement 6.685459e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.990965e-08 max(|| b_i - A x_i ||_1) 2.931712e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.683961e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.990965e-08 max(|| b_i - A x_i ||_1) 2.931712e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.683961e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.990965e-08 max(|| b_i - A x_i ||_1) 2.931712e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.683961e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.990965e-08 max(|| b_i - A x_i ||_1) 
2.931712e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.683961e-01 (SUCCESS) Test #3116: mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_pqrcpend .....***Timeout 419.76 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.713053e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.420745e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping 
criterion -1 Time for reordering 1.792300e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.384797e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.280422e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 5.445522e+00 s Time to initialize coeftab 1.797119e+00 s Time to factorize 1.919425e+01 s (532.67 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 6.164219e+00 s - iteration 1 : total iteration time 39.3 s error 2.6959e-12 Time for refinement 5.642479e+01 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.630608e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.630608e-08 max(|| b_i - A x_i ||_1) 2.756527e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.463826e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.756527e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.463826e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.630608e-08 max(|| b_i - A x_i ||_1) 2.756527e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 
3.463826e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.630608e-08 max(|| b_i - A x_i ||_1) 2.756527e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.463826e-01 (SUCCESS) Test #3118: mpi_dst_example_simple_lap_s_facto2_sched4_not_rqrcpend .................***Timeout 419.75 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.674826e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.443660e+00 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.044625e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 9.195586e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.524503e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.138642e+00 s Time to initialize coeftab 7.678060e-01 s Time to factorize 4.544075e+01 s (225.00 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 1.337313e+01 s - iteration 1 : total iteration time 32.6 s error 4.9469e-12 Time for refinement 5.476417e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.531730e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.531730e-08 max(|| b_i - A x_i ||_1) 2.728129e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.428141e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.728129e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.428141e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.531730e-08 max(|| 
b_i - A x_i ||_1) 2.728129e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.428141e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.531730e-08 max(|| b_i - A x_i ||_1) 2.728129e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.428141e-01 (SUCCESS) Test #3122: mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_rqrcpend .....***Timeout 419.73 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.555097e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes 
in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.415393e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.970073e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 9.692640e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.141181e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 5.159555e+00 s Time to initialize coeftab 8.890025e-01 s Time to factorize 4.262618e+01 s (239.86 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 2.199403e+01 s - iteration 1 : total iteration time 14.1 s error 5.0144e-12 Time for refinement 3.364844e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.501279e-08 max(|| b_i - A x_i ||_1) 2.713422e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.409660e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.501279e-08 max(|| b_i - A x_i ||_1) 2.713422e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * 
eps)) 3.409660e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.501279e-08 max(|| b_i - A x_i ||_1) 2.713422e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.409660e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.501279e-08 max(|| b_i - A x_i ||_1) 2.713422e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.409660e-01 (SUCCESS) Test #3124: mpi_dst_example_simple_lap_s_facto2_sched4_not_tqrcpend .................***Timeout 419.72 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.586834e+01 s +-------------------------------------------------+ Symbolic 
factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.373529e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.361817e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.008590e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.972959e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.526218e+00 s Time to initialize coeftab 5.017598e-01 s Time to factorize 3.760205e+01 s (271.91 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 1.625790e+01 s - iteration 1 : total iteration time 25.2 s error 3.4588e-12 Time for refinement 4.387381e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.775590e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.775590e-08 max(|| b_i - A x_i ||_1) 2.767457e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.477560e-01 (SUCCESS) max(|| 
b_i - A x_i ||_1) 2.767457e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.477560e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.775590e-08 max(|| b_i - A x_i ||_1) 2.767457e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.477560e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.775590e-08 max(|| b_i - A x_i ||_1) 2.767457e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.477560e-01 (SUCCESS) Test #3125: mpi_dst_example_simple_lap_s_facto2_sched4_kway_tqrcpbegin ..............***Timeout 419.70 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 
2.351047e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.087409e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.635262e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.629126e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.793143e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.384924e+01 s Time to initialize coeftab 9.497705e+00 s Time to factorize 4.851448e+01 s (210.75 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88 Ko / 88.6 Ko ------------------------------------------------ Total 112 Ko / 113 Ko Time to solve 8.111002e+00 s - iteration 1 : total iteration time 34.4 s error 3.672e-11 Time for refinement 5.337072e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.929990e-08 max(|| b_i - A x_i ||_1) 2.888254e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 
3.629351e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.929990e-08 max(|| b_i - A x_i ||_1) 2.888254e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.629351e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.929990e-08 max(|| b_i - A x_i ||_1) 2.888254e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.629351e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.929990e-08 max(|| b_i - A x_i ||_1) 2.888254e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.629351e-01 (SUCCESS) Test #3128: mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_tqrcpend .....***Timeout 419.70 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 
+-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.009935e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.497704e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.404709e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.825863e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.397577e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 5.049325e+00 s Time to initialize coeftab 1.771369e+00 s Time to factorize 4.976545e+01 s (205.45 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 1.905945e+01 s - iteration 1 : total iteration time 11.3 s error 6.1369e-13 Time for refinement 2.947554e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || 
b_i ||_2) 5.620663e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.620663e-08 max(|| b_i - A x_i ||_1) 2.727407e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.427233e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.620663e-08 max(|| b_i - A x_i ||_1) 2.727407e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.427233e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.620663e-08 max(|| b_i - A x_i ||_1) 2.727407e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.427233e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.727407e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.427233e-01 (SUCCESS) Test #3132: mpi_dst_example_simple_lap_s_facto2_sched4_kway_rqrrtend ................***Timeout 419.24 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 
nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.247166e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.152777e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.194443e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 7.543100e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.721411e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.481325e+00 s Time to initialize coeftab 1.838998e+00 s Time to factorize 2.115402e+01 s (483.32 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Test #3133: mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_rqrrtbegin ...***Timeout 419.23 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: 
Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.114313e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.206784e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.037229e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.161355e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.559673e+01 s +-------------------------------------------------+ Factorization 
task: Factorization used: LU Time to initialize internal csc 8.632693e+00 s Time to initialize coeftab 6.822077e+00 s Time to factorize 4.521557e+01 s (226.12 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 9.316008e+00 s - iteration 1 : total iteration time 16.5 s error 3.836e-11 Time for refinement 3.779452e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.252776e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.252776e-08 max(|| b_i - A x_i ||_1) 3.050610e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.833367e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 3.050610e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.833367e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.252776e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.252776e-08 max(|| b_i - A x_i ||_1) 3.050610e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.833367e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 3.050610e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.833367e-01 (SUCCESS) Test #3134: mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_rqrrtend .....***Timeout 418.88 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has 
been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.534611e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.501625e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.777439e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.317725e+01 s +-------------------------------------------------+ Analyze task: Total time for 
analyze 7.934155e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 4.770243e+00 s Time to initialize coeftab 2.976156e+00 s Time to factorize 2.206919e+01 s (463.28 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 9.161023e+00 s - iteration 1 : total iteration time 22.8 s error 3.4564e-12 Time for refinement 4.170557e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.622262e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.622262e-08 max(|| b_i - A x_i ||_1) 2.739598e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.442553e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 2.739598e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.442553e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.622262e-08 max(|| b_i - A x_i ||_1) 2.739598e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.442553e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.622262e-08 max(|| b_i - A x_i ||_1) 2.739598e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.442553e-01 (SUCCESS) Test #3135: mpi_dst_example_simple_lap_s_facto2_sched4_kway_pqrcpilu0 ...............***Timeout 418.87 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal ischedInit: The thread number has been automatically set to 256 Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.874067e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.295651e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.352439e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.398291e+01 s 
+-------------------------------------------------+ Analyze task: Total time for analyze 5.121971e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.400354e+00 s Time to initialize coeftab 2.598056e+00 s Time to factorize 5.874935e+01 s (174.03 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Time to solve 1.997226e+01 s - iteration 1 : total iteration time 23.7 s error 7.7939e-11 Time for refinement 4.596297e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.318233e-08 max(|| b_i - A x_i ||_1) 3.042934e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.823721e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.318233e-08 max(|| b_i - A x_i ||_1) 3.042934e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.823721e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.318233e-08 max(|| b_i - A x_i ||_1) 3.042934e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.823721e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.318233e-08 max(|| b_i - A x_i ||_1) 3.042934e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.823721e-01 (SUCCESS) Test #3138: mpi_dst_example_simple_lap_d_facto0_sched4_not_svdend ...................***Timeout 418.86 sec ischedInit: The thread number has been automatically 
set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.469324e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.022084e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.748167e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL 
Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.517802e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.257802e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 7.642319e+00 s Time to initialize coeftab 1.475128e+00 s Time to factorize 4.058576e+01 s (127.72 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 3.025919e+01 s - iteration 1 : total iteration time 36.3 s error 5.9113e-15 Time for refinement 5.585635e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.912125e-15 max(|| b_i - A x_i ||_1) 3.936360e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.946375e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.912125e-15 max(|| b_i - A x_i ||_1) 3.936360e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.946375e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.912125e-15 max(|| b_i - A x_i ||_1) 3.936360e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.946375e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.912125e-15 max(|| b_i - A x_i ||_1) 3.936360e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.946375e-03 (SUCCESS) Test #3143: mpi_dst_example_simple_lap_d_facto0_sched4_not_pqrcpbegin 
...............***Timeout 418.95 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.820357e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.927645e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.074987e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 
17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 6.635694e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.158897e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.548295e+00 s Time to initialize coeftab 3.062035e+00 s Time to factorize 3.474704e+01 s (149.19 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 6.579490e+00 s - iteration 1 : total iteration time 46.3 s error 1.5987e-14 Time for refinement 5.778922e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.599363e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.599363e-14 max(|| b_i - A x_i ||_1) 3.063341e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.849351e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 3.063341e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.849351e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.599363e-14 max(|| b_i - A x_i ||_1) 3.063341e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.849351e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.599363e-14 max(|| b_i - A x_i ||_1) 3.063341e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.849351e-02 
(SUCCESS) Test #3144: mpi_dst_example_simple_lap_d_facto0_sched4_not_pqrcpend .................***Timeout 418.04 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.502351e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.202881e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.222778e-01 s 
+-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.893821e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.773630e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 5.413156e+00 s Time to initialize coeftab 1.216919e+00 s Time to factorize 3.184513e+01 s (162.78 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 2.086857e+01 s - iteration 1 : total iteration time 12 s error 4.9314e-15 Time for refinement 2.590738e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.930008e-15 max(|| b_i - A x_i ||_1) 3.918115e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.923449e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.930008e-15 max(|| b_i - A x_i ||_1) 3.918115e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.923449e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.930008e-15 max(|| b_i - A x_i ||_1) 3.918115e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.923449e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i 
||_2) 4.930008e-15 max(|| b_i - A x_i ||_1) 3.918115e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.923449e-03 (SUCCESS) Test #3147: mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_pqrcpbegin ...***Timeout 417.58 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.981057e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.779069e-01 s +-------------------------------------------------+ 
Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.504854e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 7.295365e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.329289e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.579552e+00 s Time to initialize coeftab 1.754776e+00 s Time to factorize 7.006237e+01 s (73.99 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.856700e+01 s - iteration 1 : total iteration time 13.5 s error 1.7002e-14 Time for refinement 2.604467e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.700653e-14 max(|| b_i - A x_i ||_1) 3.087774e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.880053e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.700653e-14 max(|| b_i - A x_i ||_1) 3.087774e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.880053e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.700653e-14 max(|| b_i - A x_i ||_1) 3.087774e-15 max(|| b_i - A x_i 
||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.880053e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.700653e-14 max(|| b_i - A x_i ||_1) 3.087774e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.880053e-02 (SUCCESS) Test #3150: mpi_dst_example_simple_lap_d_facto0_sched4_not_rqrcpend .................***Timeout 417.29 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.661883e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute 
symbol matrix 6.185782e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.973358e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 8.054359e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.670165e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.901901e+00 s Time to initialize coeftab 9.786456e-01 s Time to factorize 4.874178e+01 s (106.35 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko Test #3160: mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_tqrcpend .....***Timeout 417.25 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia 
K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.603629e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.263084e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.073421e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 4.946884e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.963805e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.347748e-02 s Time to initialize coeftab 2.812032e-01 s Time to factorize 3.545611e+01 s (146.20 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko 
------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.229514e+01 s - iteration 1 : total iteration time 37 s error 1.3307e-15 Time for refinement 5.973161e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.332362e-15 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.332362e-15 max(|| b_i - A x_i ||_1) 1.607210e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.019598e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 1.607210e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.019598e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.332362e-15 max(|| b_i - A x_i ||_1) 1.607210e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.019598e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.332362e-15 max(|| b_i - A x_i ||_1) 1.607210e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.019598e-03 (SUCCESS) Test #3189: mpi_dst_example_simple_lap_d_facto1_sched4_kway_tqrcpbegin ..............***Timeout 416.32 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational 
models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.449569e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.007027e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.926306e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 8.368734e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.322558e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 6.291565e+00 s Time to initialize coeftab 4.990852e+00 s Time to factorize 9.604848e+01 s (55.80 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko 
Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.4 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.360373e+01 s - iteration 1 : total iteration time 35.3 s error 4.5852e-14 Time for refinement 5.429883e+01 s || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.584894e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.584894e-14 max(|| b_i - A x_i ||_1) 9.403369e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.181614e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.584894e-14 max(|| b_i - A x_i ||_1) 9.403369e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.181614e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.584894e-14 max(|| b_i - A x_i ||_1) 9.403369e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.181614e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 9.403369e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.181614e-01 (SUCCESS) Start 3189: mpi_dst_example_simple_lap_d_facto1_sched4_kway_tqrcpbegin Test #3190: mpi_dst_example_simple_lap_d_facto1_sched4_kway_tqrcpend ................***Timeout 415.47 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 
256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.017801e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.370532e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.826030e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.040385e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.444666e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.855656e+00 s Time to initialize coeftab 1.217047e+00 s Time to factorize 1.972397e+01 s (271.70 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 
0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 2.961708e+01 s - iteration 1 : total iteration time 16.6 s error 5.6379e-15 Time for refinement 2.973800e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.640955e-15 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.640955e-15 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.640955e-15 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.640955e-15 max(|| b_i - A x_i ||_1) 2.920551e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.669923e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 2.920551e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.669923e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 2.920551e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.669923e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 2.920551e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.669923e-03 (SUCCESS) Start 3190: mpi_dst_example_simple_lap_d_facto1_sched4_kway_tqrcpend Test #3191: mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_tqrcpbegin ...***Timeout 415.05 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.708851e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.769366e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.495106e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.125352e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.629523e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time 
to initialize internal csc 6.476310e+00 s Time to initialize coeftab 4.660081e+00 s Time to factorize 4.808820e+01 s (111.44 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.4 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Start 3191: mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_tqrcpbegin Test #3192: mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_tqrcpend .....***Timeout 414.08 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 
200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.936687e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.051794e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.674841e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.833413e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.309184e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.310431e+00 s Time to initialize coeftab 1.175946e+00 s Time to factorize 4.240903e+01 s (126.37 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Start 3192: mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_tqrcpend Test #3193: mpi_dst_example_simple_lap_d_facto1_sched4_not_rqrrtbegin ...............***Timeout 412.76 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.529224e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.712422e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.429943e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.234870e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.287529e+01 s 
+-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 5.773156e-02 s Time to initialize coeftab 4.141056e+00 s Time to factorize 4.573247e+01 s (117.18 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 3.611926e+01 s - iteration 1 : total iteration time 23.4 s error 1.5972e-12 - iteration 2 : total iteration time 16.1 s error 6.1556e-18 Time for refinement 5.590591e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.222073e-16 max(|| b_i - A x_i ||_1) 6.083878e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.644915e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.222073e-16 max(|| b_i - A x_i ||_1) 6.083878e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.644915e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.222073e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.222073e-16 max(|| b_i - A x_i ||_1) 6.083878e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.644915e-04 (SUCCESS) max(|| b_i - A x_i ||_1) 6.083878e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.644915e-04 (SUCCESS) Start 3193: mpi_dst_example_simple_lap_d_facto1_sched4_not_rqrrtbegin Test #3194: mpi_dst_example_simple_lap_d_facto1_sched4_not_rqrrtend .................***Timeout 411.69 sec ischedInit: The thread 
number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.259256e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.720141e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.125852e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 
MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.393098e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.407768e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.184275e+00 s Time to initialize coeftab 1.442507e+00 s Time to factorize 3.534437e+01 s (151.62 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Start 3194: mpi_dst_example_simple_lap_d_facto1_sched4_not_rqrrtend Test #3195: mpi_dst_example_simple_lap_d_facto1_sched4_kway_rqrrtbegin ..............***Timeout 411.68 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY 
Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.599337e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.573018e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.070508e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.819178e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.300282e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.389733e+00 s Time to initialize coeftab 4.127608e+00 s Time to factorize 4.581263e+01 s (116.98 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Start 3195: mpi_dst_example_simple_lap_d_facto1_sched4_kway_rqrrtbegin Test #3196: mpi_dst_example_simple_lap_d_facto1_sched4_kway_rqrrtend ................***Timeout 411.87 sec 
ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.563259e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.268724e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.060087e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in 
full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.110074e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.710628e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.660756e+00 s Time to initialize coeftab 1.367790e+00 s Time to factorize 3.501283e+01 s (153.06 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.664172e+01 s - iteration 1 : total iteration time 44.7 s error 2.5419e-15 Time for refinement 6.127639e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.550202e-15 max(|| b_i - A x_i ||_1) 4.088585e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.137658e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.550202e-15 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.550202e-15 max(|| b_i - A x_i ||_1) 4.088585e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.137658e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 4.088585e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.137658e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.550202e-15 max(|| b_i - A x_i ||_1) 4.088585e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.137658e-03 (SUCCESS) Start 3196: 
mpi_dst_example_simple_lap_d_facto1_sched4_kway_rqrrtend Test #3197: mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_rqrrtbegin ...***Timeout 411.32 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.891684e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.162167e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 
1.555838e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.909180e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.221321e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.264243e+01 s Time to initialize coeftab 4.472183e+00 s Time to factorize 3.983089e+01 s (134.55 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.566677e+01 s - iteration 1 : total iteration time 37.3 s error 5.2936e-13 Time for refinement 5.202542e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.293572e-13 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.293572e-13 max(|| b_i - A x_i ||_1) 9.755154e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.225819e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 9.755154e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.225819e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.293572e-13 max(|| b_i - A x_i ||_1) 9.755154e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.225819e+00 (SUCCESS) max(|| b_i - 
A x_i ||_2 / || b_i ||_2) 5.293572e-13 max(|| b_i - A x_i ||_1) 9.755154e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.225819e+00 (SUCCESS) Start 3197: mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_rqrrtbegin Test #3198: mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_rqrrtend .....***Timeout 411.59 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.486509e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 
17.592432 Time to compute symbol matrix 1.362102e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.910954e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.680786e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.795516e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.964659e+00 s Time to initialize coeftab 1.194193e+00 s Time to factorize 4.638606e+01 s (115.53 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.745229e+01 s - iteration 1 : total iteration time 15.5 s error 4.0173e-13 Time for refinement 4.425659e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.017388e-13 max(|| b_i - A x_i ||_1) 2.706967e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.401537e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.017388e-13 max(|| b_i - A x_i ||_1) 2.706967e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.401537e-01 (SUCCESS) 
max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.017388e-13 max(|| b_i - A x_i ||_1) 2.706967e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.401537e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.017388e-13 max(|| b_i - A x_i ||_1) 2.706967e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.401537e-01 (SUCCESS) Start 3198: mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_rqrrtend Test #3199: mpi_dst_example_simple_lap_d_facto1_sched4_kway_pqrcpilu0 ...............***Timeout 412.43 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 3: 200 660 2: 200 760 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.907418e+01 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.241699e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.443794e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.238071e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.508864e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.923210e+00 s Time to initialize coeftab 1.065351e+00 s Time to factorize 5.254947e+01 s (101.98 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.528054e+01 s - iteration 1 : total iteration time 24.7 s error 1.5771e-14 Time for refinement 4.749351e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.577669e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.577669e-14 max(|| b_i - A x_i ||_1) 2.543033e-15 max(|| b_i - A x_i 
||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.195540e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.577669e-14 max(|| b_i - A x_i ||_1) 2.543033e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.195540e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 2.543033e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.195540e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.577669e-14 max(|| b_i - A x_i ||_1) 2.543033e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.195540e-02 (SUCCESS) Start 3199: mpi_dst_example_simple_lap_d_facto1_sched4_kway_pqrcpilu0 Test #3200: mpi_dst_example_simple_lap_d_facto1_sched4_kway_pqrcpilu1 ...............***Timeout 413.10 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 
3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.718359e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.844489e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.116478e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 8.184231e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.604233e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.281664e+00 s Time to initialize coeftab 6.108775e-01 s Time to factorize 8.027084e+01 s (66.76 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 2.346852e+01 s - iteration 1 : total iteration time 15.1 s error 1.1803e-14 Time for refinement 3.006281e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - 
A x_i ||_2 / || b_i ||_2) 1.179975e-14 max(|| b_i - A x_i ||_1) 2.045999e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.570973e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.179975e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.179975e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.179975e-14 max(|| b_i - A x_i ||_1) 2.045999e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.570973e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 2.045999e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.570973e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 2.045999e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.570973e-02 (SUCCESS) Start 3200: mpi_dst_example_simple_lap_d_facto1_sched4_kway_pqrcpilu1 Test #3201: mpi_dst_example_simple_lap_d_facto2_sched4_not_svdbegin .................***Timeout 413.78 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections 
depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.922273e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.194293e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.491878e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 7.706856e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.511878e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 5.085541e+00 s Time to initialize coeftab 7.010939e+00 s Time to factorize 1.013381e+02 s (100.89 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 1.375573e+01 s - iteration 1 : total iteration time 23.9 s error 2.0637e-14 Time for refinement 4.796667e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 
2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.063609e-14 max(|| b_i - A x_i ||_1) 3.845814e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.832596e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.063609e-14 max(|| b_i - A x_i ||_1) 3.845814e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.832596e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.063609e-14 max(|| b_i - A x_i ||_1) 3.845814e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.832596e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.063609e-14 max(|| b_i - A x_i ||_1) 3.845814e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.832596e-02 (SUCCESS) Start 3201: mpi_dst_example_simple_lap_d_facto2_sched4_not_svdbegin Test #3202: mpi_dst_example_simple_lap_d_facto2_sched4_not_svdend ...................***Timeout 414.45 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of 
projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.198216e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.801087e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.831937e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 8.755715e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.528076e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.492689e+00 s Time to initialize coeftab 5.189000e-01 s Time to factorize 5.424507e+01 s (188.48 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 1.708085e+01 s - iteration 1 : total iteration time 14.8 s error 4.5032e-16 Time for refinement 2.944449e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 
4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.686124e-16 max(|| b_i - A x_i ||_1) 1.061575e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.333961e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.686124e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.686124e-16 max(|| b_i - A x_i ||_1) 1.061575e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.333961e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 1.061575e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.333961e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.686124e-16 max(|| b_i - A x_i ||_1) 1.061575e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.333961e-03 (SUCCESS) Start 3202: mpi_dst_example_simple_lap_d_facto2_sched4_not_svdend Test #3203: mpi_dst_example_simple_lap_d_facto2_sched4_kway_svdbegin ................***Timeout 415.11 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 
Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.818221e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.700649e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.198246e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 7.659594e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.884401e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.557463e+00 s Time to initialize coeftab 7.110401e+00 s Time to factorize 9.185303e+01 s (111.31 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 8.417862e+00 s - iteration 1 : total 
iteration time 46.8 s error 2.1802e-14 Time for refinement 6.425586e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.180886e-14 max(|| b_i - A x_i ||_1) 4.314184e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.421143e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.180886e-14 max(|| b_i - A x_i ||_1) 4.314184e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.421143e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.180886e-14 max(|| b_i - A x_i ||_1) 4.314184e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.421143e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.180886e-14 max(|| b_i - A x_i ||_1) 4.314184e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.421143e-02 (SUCCESS) Start 3203: mpi_dst_example_simple_lap_d_facto2_sched4_kway_svdbegin Test #3204: mpi_dst_example_simple_lap_d_facto2_sched4_kway_svdend ..................***Timeout 414.63 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 
GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.104235e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.924281e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.669135e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.524676e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.900921e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 5.180946e+00 s Time to initialize coeftab 3.572344e+00 s Time to factorize 5.194229e+01 s (196.84 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 
177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 1.633973e+01 s - iteration 1 : total iteration time 18.9 s error 1.582e-15 Time for refinement 3.614903e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.584150e-15 max(|| b_i - A x_i ||_1) 2.239978e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.814725e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.584150e-15 max(|| b_i - A x_i ||_1) 2.239978e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.814725e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.584150e-15 max(|| b_i - A x_i ||_1) 2.239978e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.814725e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.584150e-15 max(|| b_i - A x_i ||_1) 2.239978e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.814725e-03 (SUCCESS) Start 3204: mpi_dst_example_simple_lap_d_facto2_sched4_kway_svdend Test #3205: mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_svdbegin .....***Timeout 414.35 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple 
Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.882165e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.349469e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.376807e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.308847e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.698019e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 4.475317e+00 s Time to initialize coeftab 6.950162e+00 s Time to factorize 6.418453e+01 s (159.29 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: 
------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 225 Ko / 226 Ko Start 3205: mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_svdbegin Test #3206: mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_svdend .......***Timeout 414.37 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.325671e+01 s +-------------------------------------------------+ 
Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.124844e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.430608e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 9.267958e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.507617e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 4.199976e+00 s Time to initialize coeftab 9.645615e-01 s Time to factorize 3.662502e+01 s (279.16 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 2.361357e+01 s - iteration 1 : total iteration time 11.3 s error 1.9371e-15 Time for refinement 3.305373e+01 s || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.938845e-15 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.938845e-15 max(|| b_i - A x_i ||_1) 2.945712e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.701540e-03 (SUCCESS) 
max(|| b_i - A x_i ||_1) 2.945712e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.701540e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.938845e-15 max(|| b_i - A x_i ||_1) 2.945712e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.701540e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.938845e-15 max(|| b_i - A x_i ||_1) 2.945712e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.701540e-03 (SUCCESS) Start 3206: mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_svdend Test #3207: mpi_dst_example_simple_lap_d_facto2_sched4_not_pqrcpbegin ...............***Timeout 414.55 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 
+-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.929264e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.045656e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.585673e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 8.371981e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.046792e+02 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.646067e+00 s Time to initialize coeftab 7.926025e+00 s Time to factorize 5.647181e+01 s (181.05 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Start 3207: mpi_dst_example_simple_lap_d_facto2_sched4_not_pqrcpbegin Test #3208: mpi_dst_example_simple_lap_d_facto2_sched4_not_pqrcpend .................***Timeout 414.85 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of 
threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.515833e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.481882e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.045115e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 9.061382e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.302689e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 7.293818e-01 s Time to initialize coeftab 1.328999e+00 s Time to factorize 
3.307592e+01 s (309.11 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 2.376852e+01 s - iteration 1 : total iteration time 30.3 s error 2.4911e-16 Time for refinement 4.984986e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.780244e-16 max(|| b_i - A x_i ||_1) 7.567320e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.508988e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.780244e-16 max(|| b_i - A x_i ||_1) 7.567320e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.508988e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.780244e-16 max(|| b_i - A x_i ||_1) 7.567320e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.508988e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.780244e-16 max(|| b_i - A x_i ||_1) 7.567320e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.508988e-04 (SUCCESS) Start 3208: mpi_dst_example_simple_lap_d_facto2_sched4_not_pqrcpend Test #3209: mpi_dst_example_simple_lap_d_facto2_sched4_kway_pqrcpbegin ..............***Timeout 414.84 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.932723e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.297496e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.426183e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 8.053562e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.968435e+01 s +-------------------------------------------------+ 
Factorization task: Factorization used: LU Time to initialize internal csc 3.675606e+00 s Time to initialize coeftab 2.581373e+00 s Time to factorize 5.010023e+01 s (204.08 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 2.336256e+01 s - iteration 1 : total iteration time 12.5 s error 1.7906e-14 Time for refinement 2.507041e+01 s || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.790990e-14 max(|| b_i - A x_i ||_1) 3.214441e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.039221e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.790990e-14 max(|| b_i - A x_i ||_1) 3.214441e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.039221e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.790990e-14 max(|| b_i - A x_i ||_1) 3.214441e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.039221e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.790990e-14 max(|| b_i - A x_i ||_1) 3.214441e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.039221e-02 (SUCCESS) Start 3209: mpi_dst_example_simple_lap_d_facto2_sched4_kway_pqrcpbegin Test #3211: mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_pqrcpbegin ...***Timeout 413.56 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: 
The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.138727e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.359279e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.435704e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 
9.223154e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.237270e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 4.316233e+00 s Time to initialize coeftab 4.078722e+00 s Time to factorize 4.694703e+01 s (217.78 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 1.284414e+01 s - iteration 1 : total iteration time 26 s error 1.6684e-14 Time for refinement 4.987714e+01 s || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.668373e-14 max(|| b_i - A x_i ||_1) 3.116701e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.916403e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.668373e-14 max(|| b_i - A x_i ||_1) 3.116701e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.916403e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.668373e-14 max(|| b_i - A x_i ||_1) 3.116701e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.916403e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.668373e-14 max(|| b_i - A x_i ||_1) 3.116701e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.916403e-02 (SUCCESS) Start 3211: mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_pqrcpbegin Test #3212: 
mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_pqrcpend .....***Timeout 413.32 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.717076e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.128774e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.954762e-01 s +-------------------------------------------------+ 
Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 6.771297e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.329363e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.570366e+00 s Time to initialize coeftab 7.825775e-01 s Time to factorize 4.701856e+01 s (217.45 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Start 3212: mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_pqrcpend Test #3213: mpi_dst_example_simple_lap_d_facto2_sched4_not_rqrcpbegin ...............***Timeout 413.49 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress 
minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.174990e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.527648e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.386971e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 7.102804e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.365541e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.434021e+00 s Time to initialize coeftab 4.504861e+00 s Time to factorize 8.127035e+01 s (125.80 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 176 Ko / 177 Ko ------------------------------------------------ Total 224 Ko / 226 Ko Start 3213: 
mpi_dst_example_simple_lap_d_facto2_sched4_not_rqrcpbegin Test #3214: mpi_dst_example_simple_lap_d_facto2_sched4_not_rqrcpend .................***Timeout 413.49 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.554461e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.439398e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.813036e-01 s 
+-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 4.275784e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.807890e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 8.504675e-02 s Time to initialize coeftab 1.969355e-01 s Time to factorize 2.590909e+01 s (394.62 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 2.426762e+01 s - iteration 1 : total iteration time 21.4 s error 9.3636e-16 Time for refinement 4.471788e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.507351e-16 max(|| b_i - A x_i ||_1) 1.606514e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.018723e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.507351e-16 max(|| b_i - A x_i ||_1) 1.606514e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.018723e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.507351e-16 max(|| b_i - A x_i ||_1) 1.606514e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.018723e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i 
||_2) 9.507351e-16 max(|| b_i - A x_i ||_1) 1.606514e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.018723e-03 (SUCCESS) Start 3214: mpi_dst_example_simple_lap_d_facto2_sched4_not_rqrcpend Test #3215: mpi_dst_example_simple_lap_d_facto2_sched4_kway_rqrcpbegin ..............***Timeout 413.47 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.261873e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.172684e+00 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.997390e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 8.532432e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.880381e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 5.250721e+00 s Time to initialize coeftab 6.628731e+00 s Time to factorize 6.446432e+01 s (158.60 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 225 Ko / 226 Ko Time to solve 2.036305e+01 s - iteration 1 : total iteration time 27.5 s error 1.1043e-13 Time for refinement 5.140925e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.104298e-13 max(|| b_i - A x_i ||_1) 1.551959e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.950170e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.104298e-13 max(|| b_i - A x_i ||_1) 1.551959e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.950170e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.104298e-13 max(|| b_i 
- A x_i ||_1) 1.551959e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.950170e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.104298e-13 max(|| b_i - A x_i ||_1) 1.551959e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.950170e-01 (SUCCESS) Start 3215: mpi_dst_example_simple_lap_d_facto2_sched4_kway_rqrcpbegin Test #3216: mpi_dst_example_simple_lap_d_facto2_sched4_kway_rqrcpend ................***Timeout 413.70 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.317425e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol 
factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.670367e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.331547e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 4.056375e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.603846e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 7.351511e-02 s Time to initialize coeftab 3.918167e-01 s Time to factorize 2.713508e+01 s (376.79 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 2.261802e+01 s - iteration 1 : total iteration time 18.4 s error 2.0976e-15 Time for refinement 4.125324e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.096447e-15 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.096447e-15 max(|| b_i - A x_i ||_1) 2.833141e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.560085e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 2.833141e-16 
max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.560085e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.096447e-15 max(|| b_i - A x_i ||_1) 2.833141e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.560085e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.096447e-15 max(|| b_i - A x_i ||_1) 2.833141e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.560085e-03 (SUCCESS) Start 3216: mpi_dst_example_simple_lap_d_facto2_sched4_kway_rqrcpend Test #3218: mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_rqrcpend .....***Timeout 413.92 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : 
Ordering method is: Scotch Time to compute ordering 2.854240e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.159910e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.442631e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.640996e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.619754e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 4.289668e+00 s Time to initialize coeftab 1.100459e+00 s Time to factorize 4.589592e+01 s (222.77 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 2.175870e+01 s - iteration 1 : total iteration time 16.1 s error 2.349e-15 Time for refinement 4.443570e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.348604e-15 max(|| b_i - A x_i ||_1) 2.825605e-16 max(|| b_i - 
A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.550615e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.348604e-15 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.348604e-15 max(|| b_i - A x_i ||_1) 2.825605e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.550615e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 2.825605e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.550615e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.348604e-15 max(|| b_i - A x_i ||_1) 2.825605e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.550615e-03 (SUCCESS) Start 3218: mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_rqrcpend Test #3219: mpi_dst_example_simple_lap_d_facto2_sched4_not_tqrcpbegin ...............***Timeout 414.26 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: 
CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.432385e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.950580e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.283298e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.772818e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.712320e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.113303e+00 s Time to initialize coeftab 6.585002e+00 s Time to factorize 8.745010e+01 s (116.91 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 225 Ko / 226 Ko Start 3219: mpi_dst_example_simple_lap_d_facto2_sched4_not_tqrcpbegin Test #3220: mpi_dst_example_simple_lap_d_facto2_sched4_not_tqrcpend .................***Timeout 414.37 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been 
automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.310834e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.264558e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.112011e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.223386e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.492915e+01 
s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 4.512363e+00 s Time to initialize coeftab 9.938437e-01 s Time to factorize 4.221850e+01 s (242.17 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 2.079743e+01 s - iteration 1 : total iteration time 32.1 s error 3.8404e-16 Time for refinement 4.858691e+01 s || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.974435e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.974435e-16 max(|| b_i - A x_i ||_1) 8.717807e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.095467e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 8.717807e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.095467e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.974435e-16 max(|| b_i - A x_i ||_1) 8.717807e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.095467e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.974435e-16 max(|| b_i - A x_i ||_1) 8.717807e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.095467e-03 (SUCCESS) Start 3220: mpi_dst_example_simple_lap_d_facto2_sched4_not_tqrcpend Test #3221: mpi_dst_example_simple_lap_d_facto2_sched4_kway_tqrcpbegin ..............***Timeout 414.33 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread 
number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.305629e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.267909e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.467908e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s 
Time for mapping/scheduling 9.628239e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.649448e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.796368e+00 s Time to initialize coeftab 6.789105e+00 s Time to factorize 7.233016e+01 s (141.35 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 225 Ko / 226 Ko Time to solve 1.476018e+01 s - iteration 1 : total iteration time 32.5 s error 6.2865e-14 Time for refinement 5.298420e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.286943e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.286943e-14 max(|| b_i - A x_i ||_1) 1.066551e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.340214e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.286943e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.286943e-14 max(|| b_i - A x_i ||_1) 1.066551e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.340214e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 1.066551e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.340214e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 1.066551e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.340214e-01 (SUCCESS) Start 3221: mpi_dst_example_simple_lap_d_facto2_sched4_kway_tqrcpbegin Test #3222: 
mpi_dst_example_simple_lap_d_facto2_sched4_kway_tqrcpend ................***Timeout 414.28 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.792131e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.437883e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.423335e-01 s +-------------------------------------------------+ Mapping/Scheduling 
subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 5.799911e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.936277e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.685932e+00 s Time to initialize coeftab 8.096638e-01 s Time to factorize 2.744100e+01 s (372.59 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 3.739677e+01 s - iteration 1 : total iteration time 23 s error 1.7792e-16 Time for refinement 4.529146e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.138083e-16 max(|| b_i - A x_i ||_1) 6.431886e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.082218e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.138083e-16 max(|| b_i - A x_i ||_1) 6.431886e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.082218e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.138083e-16 max(|| b_i - A x_i ||_1) 6.431886e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.082218e-04 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.138083e-16 max(|| b_i - A x_i ||_1) 6.431886e-17 max(|| b_i - A x_i 
||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.082218e-04 (SUCCESS) Start 3222: mpi_dst_example_simple_lap_d_facto2_sched4_kway_tqrcpend Test #3223: mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_tqrcpbegin ...***Timeout 414.55 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.956199e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.134388e+00 s +-------------------------------------------------+ 
Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.196854e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.131533e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.384247e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.201851e+00 s Time to initialize coeftab 6.360848e+00 s Time to factorize 1.038511e+02 s (98.45 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 176 Ko / 177 Ko ------------------------------------------------ Total 224 Ko / 226 Ko Start 3223: mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_tqrcpbegin Test #3224: mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_tqrcpend .....***Timeout 414.56 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia 
K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.361311e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.463663e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.268072e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.276671e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.260741e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 5.378292e+00 s Time to initialize coeftab 1.807652e+00 s Time to factorize 3.349119e+01 s (305.28 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside 
selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 1.031911e+01 s - iteration 1 : total iteration time 34 s error 6.7197e-15 Time for refinement 5.414311e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.725285e-15 max(|| b_i - A x_i ||_1) 5.014773e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.301493e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.725285e-15 max(|| b_i - A x_i ||_1) 5.014773e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.301493e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.725285e-15 max(|| b_i - A x_i ||_1) 5.014773e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.301493e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.725285e-15 max(|| b_i - A x_i ||_1) 5.014773e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.301493e-03 (SUCCESS) Start 3224: mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_tqrcpend Test #3225: mpi_dst_example_simple_lap_d_facto2_sched4_not_rqrrtbegin ...............***Timeout 414.49 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 
160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.570106e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.113369e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.959175e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.684262e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.164145e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.292871e+01 s Time to initialize coeftab 6.808085e+00 s Time to factorize 7.113892e+01 s (143.72 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: 
------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 3.085337e+01 s - iteration 1 : total iteration time 5 s error 9.7523e-13 Time for refinement 1.982070e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.752314e-13 max(|| b_i - A x_i ||_1) 1.481399e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.861505e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.752314e-13 max(|| b_i - A x_i ||_1) 1.481399e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.861505e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.752314e-13 max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.752314e-13 max(|| b_i - A x_i ||_1) 1.481399e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.861505e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.481399e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.861505e+00 (SUCCESS) Start 3225: mpi_dst_example_simple_lap_d_facto2_sched4_not_rqrrtbegin Test #3226: mpi_dst_example_simple_lap_d_facto2_sched4_not_rqrrtend .................***Timeout 414.44 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.175441e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.213982e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.890387e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 8.325505e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.325403e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize 
internal csc 3.060943e+00 s Time to initialize coeftab 1.429334e+00 s Time to factorize 4.106176e+01 s (249.00 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 2.653811e+01 s - iteration 1 : total iteration time 13.2 s error 3.2335e-13 Time for refinement 2.795977e+01 s || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.233505e-13 max(|| b_i - A x_i ||_1) 2.808318e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.528893e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.233505e-13 max(|| b_i - A x_i ||_1) 2.808318e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.528893e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.233505e-13 max(|| b_i - A x_i ||_1) 2.808318e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.528893e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.233505e-13 max(|| b_i - A x_i ||_1) 2.808318e-14 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.528893e-01 (SUCCESS) Start 3226: mpi_dst_example_simple_lap_d_facto2_sched4_not_rqrrtend Test #3227: mpi_dst_example_simple_lap_d_facto2_sched4_kway_rqrrtbegin ..............***Timeout 413.71 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.925626e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.181861e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.611278e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 6.443925e+00 s +-------------------------------------------------+ Analyze 
task: Total time for analyze 7.996523e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.377528e+00 s Time to initialize coeftab 3.409189e+00 s Time to factorize 7.104826e+01 s (143.91 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Start 3227: mpi_dst_example_simple_lap_d_facto2_sched4_kway_rqrrtbegin Test #3228: mpi_dst_example_simple_lap_d_facto2_sched4_kway_rqrrtend ................***Timeout 411.05 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 
Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.676983e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.999216e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.591593e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.879800e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.919621e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.363047e-01 s Time to initialize coeftab 7.223757e-01 s Time to factorize 4.205607e+01 s (243.11 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 1.099542e+01 s - iteration 1 : total iteration time 13 s error 2.593e-14 Time for refinement 3.480500e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 
4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.592919e-14 max(|| b_i - A x_i ||_1) 3.734064e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.692172e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.592919e-14 max(|| b_i - A x_i ||_1) 3.734064e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.692172e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.592919e-14 max(|| b_i - A x_i ||_1) 3.734064e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.692172e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.592919e-14 max(|| b_i - A x_i ||_1) 3.734064e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.692172e-02 (SUCCESS) Start 3228: mpi_dst_example_simple_lap_d_facto2_sched4_kway_rqrrtend Test #3229: mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_rqrrtbegin ...***Timeout 410.61 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 
Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.057268e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.060195e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.013064e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 7.753849e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.097937e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.163236e+00 s Time to initialize coeftab 4.040663e+00 s Time to factorize 1.040515e+02 s (98.26 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 2.973999e+01 s - iteration 1 : total iteration time 5.72 s error 7.4463e-13 Time for refinement 1.500294e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 
|| A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.446343e-13 max(|| b_i - A x_i ||_1) 1.613804e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.027884e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.446343e-13 max(|| b_i - A x_i ||_1) 1.613804e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.027884e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.446343e-13 max(|| b_i - A x_i ||_1) 1.613804e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.027884e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.446343e-13 max(|| b_i - A x_i ||_1) 1.613804e-13 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.027884e+00 (SUCCESS) Start 3229: mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_rqrrtbegin Test #3230: mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_rqrrtend .....***Timeout 410.33 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY 
and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.571016e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.463279e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.364068e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.285967e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.280441e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 5.247707e+00 s Time to initialize coeftab 1.967010e+00 s Time to factorize 2.799642e+01 s (365.20 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 1.551681e+01 s - iteration 1 
: total iteration time 23.2 s error 3.2513e-14 Time for refinement 3.991390e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.251340e-14 max(|| b_i - A x_i ||_1) 2.546717e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.200169e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.251340e-14 max(|| b_i - A x_i ||_1) 2.546717e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.200169e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.251340e-14 max(|| b_i - A x_i ||_1) 2.546717e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.200169e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.251340e-14 max(|| b_i - A x_i ||_1) 2.546717e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.200169e-02 (SUCCESS) Start 3230: mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_rqrrtend Test #3231: mpi_dst_example_simple_lap_d_facto2_sched4_kway_pqrcpilu0 ...............***Timeout 410.09 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size 
(min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.720153e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.412337e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.230647e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.948986e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.039951e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 4.411975e+00 s Time to initialize coeftab 5.714539e+00 s Time to factorize 5.162901e+01 s (198.03 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o 
Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 1.003158e+01 s - iteration 1 : total iteration time 36.3 s error 1.0335e-14 Time for refinement 5.169315e+01 s || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.033057e-14 max(|| b_i - A x_i ||_1) 1.608827e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.021629e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.033057e-14 max(|| b_i - A x_i ||_1) 1.608827e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.021629e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.033057e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.033057e-14 max(|| b_i - A x_i ||_1) 1.608827e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.021629e-02 (SUCCESS) max(|| b_i - A x_i ||_1) 1.608827e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.021629e-02 (SUCCESS) Start 3231: mpi_dst_example_simple_lap_d_facto2_sched4_kway_pqrcpilu0 Test #3232: mpi_dst_example_simple_lap_d_facto2_sched4_kway_pqrcpilu1 ...............***Timeout 410.21 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI 
communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.915362e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.518432e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.846655e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 9.215177e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.982359e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.666592e+00 s Time to initialize coeftab 1.040989e+00 s Time to factorize 5.520127e+01 s (185.22 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: 
------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Start 3232: mpi_dst_example_simple_lap_d_facto2_sched4_kway_pqrcpilu1 Test #3234: mpi_dst_example_simple_lap_c_facto0_sched4_not_svdend ...................***Timeout 411.97 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.527132e+01 s +-------------------------------------------------+ Symbolic factorization 
subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.809342e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.512803e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.047292e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.513461e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.029813e+00 s Time to initialize coeftab 6.153030e-01 s Time to factorize 1.907298e+02 s (108.89 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Start 3234: mpi_dst_example_simple_lap_c_facto0_sched4_not_svdend Test #3240: mpi_dst_example_simple_lap_c_facto0_sched4_not_pqrcpend .................***Timeout 454.69 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank 
parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.775132e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.992115e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.242327e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.732588e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.300158e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.987917e+00 s Time to initialize coeftab 2.966510e+00 s Time to factorize 3.086669e+01 s (672.82 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko 
/ 137 Ko Time to solve 8.846104e+00 s Time for refinement 6.198252e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.995595e-07 max(|| b_i - A x_i ||_1) 1.172510e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.958650e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.995595e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.995595e-07 max(|| b_i - A x_i ||_1) 1.172510e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.958650e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.172510e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.958650e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.995595e-07 max(|| b_i - A x_i ||_1) 1.172510e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.958650e+00 (SUCCESS) Start 3240: mpi_dst_example_simple_lap_c_facto0_sched4_not_pqrcpend Test #3242: mpi_dst_example_simple_lap_c_facto0_sched4_kway_pqrcpend ................***Timeout 454.41 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 
GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.905142e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.711605e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.490793e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 7.735945e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.764640e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.095486e+00 s Time to initialize coeftab 1.408008e+00 s Time to factorize 4.712326e+01 s (440.71 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o 
Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 5.284905e+00 s Time for refinement 2.879983e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.815651e-07 max(|| b_i - A x_i ||_1) 1.062158e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.680195e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.815651e-07 max(|| b_i - A x_i ||_1) 1.062158e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.680195e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.815651e-07 max(|| b_i - A x_i ||_1) 1.062158e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.680195e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.815651e-07 max(|| b_i - A x_i ||_1) 1.062158e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.680195e+00 (SUCCESS) Start 3242: mpi_dst_example_simple_lap_c_facto0_sched4_kway_pqrcpend Test #3243: mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_pqrcpbegin ...***Timeout 453.27 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size 
(min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.684714e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.210652e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.369081e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.246130e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.699368e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 5.474004e+00 s Time to initialize coeftab 4.130198e+00 s Time to factorize 1.151584e+02 s (180.34 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 
o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Start 3243: mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_pqrcpbegin Test #3244: mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_pqrcpend .....***Timeout 452.95 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.627975e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct 
Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.904467e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.496438e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 6.005098e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.657827e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.164384e+00 s Time to initialize coeftab 6.463367e-01 s Time to factorize 5.346215e+01 s (388.45 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Start 3244: mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_pqrcpend Test #3246: mpi_dst_example_simple_lap_c_facto0_sched4_not_rqrcpend .................***Timeout 454.99 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI 
communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.691526e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.092465e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.494458e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 7.409714e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.484002e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.623894e+00 s Time to initialize coeftab 6.618323e-01 s Time to factorize 6.993334e+01 s (296.96 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: 
------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Start 3246: mpi_dst_example_simple_lap_c_facto0_sched4_not_rqrcpend Test #3248: mpi_dst_example_simple_lap_c_facto0_sched4_kway_rqrcpend ................***Timeout 454.72 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.979624e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute 
symbol matrix 1.300337e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.303991e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 7.840652e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.887103e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.751321e+00 s Time to initialize coeftab 7.315972e-01 s Time to factorize 1.195529e+02 s (173.71 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 2.362218e+01 s Time for refinement 1.363347e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.194737e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.194737e-07 max(|| b_i - A x_i ||_1) 9.148805e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.308562e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 9.148805e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.308562e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.194737e-07 max(|| b_i - A x_i ||_1) 
9.148805e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.308562e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.194737e-07 max(|| b_i - A x_i ||_1) 9.148805e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.308562e+00 (SUCCESS) Start 3248: mpi_dst_example_simple_lap_c_facto0_sched4_kway_rqrcpend Test #3249: mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_rqrcpbegin ...***Timeout 454.72 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.606930e+01 s +-------------------------------------------------+ Symbolic factorization subtask: 
Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.762830e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.498618e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.847821e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.966398e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.617167e+00 s Time to initialize coeftab 5.422292e+00 s Time to factorize 2.025317e+02 s (102.54 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Start 3249: mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_rqrcpbegin Test #3250: mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_rqrcpend .....***Timeout 454.72 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: 
PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.858968e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.912459e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.715602e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.187010e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.145256e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.583682e+00 s Time to initialize coeftab 1.126599e+00 s Time to factorize 7.026110e+01 s (295.58 KFlop/s) Number of operations 
102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.895645e+01 s - iteration 1 : total iteration time 18.5 s error 1.5747e-11 Time for refinement 4.584193e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.272790e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.272790e-08 max(|| b_i - A x_i ||_1) 3.181865e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.028952e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.272790e-08 max(|| b_i - A x_i ||_2 / || b_i ||_2) 6.272790e-08 max(|| b_i - A x_i ||_1) 3.181865e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.028952e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 3.181865e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.028952e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 3.181865e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 8.028952e-01 (SUCCESS) Start 3250: mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_rqrcpend Test #3254: mpi_dst_example_simple_lap_c_facto0_sched4_kway_tqrcpend ................***Timeout 456.46 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: 
Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.342429e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.149536e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.666636e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 5.937380e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.216721e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h 
Time to initialize internal csc 4.647897e-02 s Time to initialize coeftab 1.456914e+00 s Time to factorize 1.191116e+02 s (174.35 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 2.182896e+01 s Time for refinement 1.955530e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.242746e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.242746e-07 max(|| b_i - A x_i ||_1) 1.221606e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.082538e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.242746e-07 max(|| b_i - A x_i ||_1) 1.221606e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.082538e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.242746e-07 max(|| b_i - A x_i ||_1) 1.221606e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.082538e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.221606e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.082538e+00 (SUCCESS) Start 3254: mpi_dst_example_simple_lap_c_facto0_sched4_kway_tqrcpend Test #3256: mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_tqrcpend .....***Timeout 456.66 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been 
automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.661449e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.134729e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.134191e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 7.737013e+00 s +-------------------------------------------------+ Analyze task: Total time for 
analyze 4.061541e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.643647e+00 s Time to initialize coeftab 1.404378e+00 s Time to factorize 1.332252e+02 s (155.88 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 2.335141e+01 s Time for refinement 2.932228e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.306385e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.306385e-07 max(|| b_i - A x_i ||_1) 1.252981e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.161707e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.252981e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.161707e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.306385e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 9.306385e-07 max(|| b_i - A x_i ||_1) 1.252981e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.161707e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 1.252981e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.161707e+00 (SUCCESS) Start 3256: mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_tqrcpend Test #3075: mpi_dst_example_simple_lap_s_facto1_sched4_kway_svdbegin ................***Timeout 444.95 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been 
automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.978481e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.252844e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.758115e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for 
mapping/scheduling 1.021396e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.116194e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.529739e+00 s Time to initialize coeftab 4.166361e+00 s Time to factorize 1.684609e+02 s (31.81 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 6.882579e+00 s Time for refinement 4.038299e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996109e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996109e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996109e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996109e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.924821e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.924821e-07 max(|| b_i - A x_i ||_1) 8.733476e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.097440e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.924821e-07 max(|| b_i - A x_i ||_1) 8.733476e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.097440e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.924821e-07 max(|| b_i - A x_i ||_1) 8.733476e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.097440e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.733476e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.097440e+00 (SUCCESS) Test #3111: mpi_dst_example_simple_lap_s_facto2_sched4_not_pqrcpbegin ...............***Timeout 437.30 sec ischedInit: The thread number has been automatically set to 256 ischedInit: 
The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.192814e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.487527e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.784279e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 
1.548061e-03 s Time for mapping/scheduling 9.469200e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.849074e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 4.066831e+00 s Time to initialize coeftab 2.202278e+00 s Time to factorize 4.142500e+01 s (246.81 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Test #3126: mpi_dst_example_simple_lap_s_facto2_sched4_kway_tqrcpend ................***Timeout 353.73 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections 
distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.373406e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.723568e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.054158e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 5.936528e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.189680e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.656194e+00 s Time to initialize coeftab 8.358601e-01 s Time to factorize 3.130819e+01 s (326.57 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Test #3142: mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_svdend .......***Timeout 348.30 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: 
The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.673727e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.670005e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.320227e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 9.482245e+00 s +-------------------------------------------------+ Analyze 
task: Total time for analyze 7.918961e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.250495e+00 s Time to initialize coeftab 1.514996e+00 s Time to factorize 5.411111e+01 s (95.80 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 4.182624e+00 s Test #3146: mpi_dst_example_simple_lap_d_facto0_sched4_kway_pqrcpend ................***Timeout 348.48 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: 
Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.837012e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.764364e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.649712e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 6.802862e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.821195e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.475634e+00 s Time to initialize coeftab 8.623765e+00 s Time to factorize 2.088987e+01 s (248.15 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.400813e+01 s - iteration 1 : total iteration time 4.33 s error 4.9473e-15 Time for refinement 8.315213e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 
5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.940631e-15 max(|| b_i - A x_i ||_1) 5.577677e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.008831e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.940631e-15 max(|| b_i - A x_i ||_1) 5.577677e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.008831e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.940631e-15 max(|| b_i - A x_i ||_1) 5.577677e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.008831e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.940631e-15 max(|| b_i - A x_i ||_1) 5.577677e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 7.008831e-03 (SUCCESS) Test #3148: mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_pqrcpend .....***Timeout 348.48 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 
ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.801999e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.394248e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.023282e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 6.258169e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.694777e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.877201e+00 s Time to initialize coeftab 3.176807e+00 s Time to factorize 3.603859e+01 s (143.84 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 4.968659e+00 s - iteration 1 : total iteration time 3.62 s error 6.9468e-16 Time for refinement 8.022130e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 
4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.093070e-02 max(|| x_i ||_oo) 4.996108e-01 Test #3153: mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_rqrcpbegin ...***Timeout 347.51 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.107162e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.037008e+00 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.663174e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 6.143051e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.937631e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.918766e+00 s Time to initialize coeftab 1.691682e+01 s Time to factorize 3.528052e+01 s (146.93 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88 Ko / 88.6 Ko ------------------------------------------------ Total 136 Ko / 137 Ko Time to solve 7.219511e+00 s Test #3156: mpi_dst_example_simple_lap_d_facto0_sched4_not_tqrcpend .................***Timeout 346.35 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 
Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.703302e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.172259e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.807746e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.623351e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.600728e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.517571e+00 s Time to initialize coeftab 3.501442e+00 s Time to factorize 2.680754e+01 s (193.37 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o 
/ 0 o Outside 88.6 Ko / 88.6 Ko Test #3164: mpi_dst_example_simple_lap_d_facto0_sched4_kway_rqrrtend ................***Timeout 293.19 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.291149e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.574759e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.248936e-01 s 
+-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 7.482243e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.222024e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.427301e+00 s Time to initialize coeftab 4.445489e-01 s Time to factorize 1.073564e+01 s (482.86 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 2.123278e+00 s Test #3368: mpi_dst_example_simple_lap_c_facto4_sched4_not_pqrcpend .................***Timeout 261.77 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time 
Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.841381e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.488472e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.992508e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.826506e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.271790e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 1.476167e+00 s Time to initialize coeftab 4.348082e-01 s Start 3368: mpi_dst_example_simple_lap_c_facto4_sched4_not_pqrcpend Start 3510: mpi_dst_example_simple_lap_z_facto3_sched4_kway_tqrcpend Start 3511: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_tqrcpbegin Start 3512: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_tqrcpend Start 3513: mpi_dst_example_simple_lap_z_facto3_sched4_not_rqrrtbegin Start 3514: 
mpi_dst_example_simple_lap_z_facto3_sched4_not_rqrrtend Start 3515: mpi_dst_example_simple_lap_z_facto3_sched4_kway_rqrrtbegin Start 3516: mpi_dst_example_simple_lap_z_facto3_sched4_kway_rqrrtend Start 3517: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_rqrrtbegin Start 3518: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_rqrrtend Start 3519: mpi_dst_example_simple_lap_z_facto3_sched4_kway_pqrcpilu0 Start 3520: mpi_dst_example_simple_lap_z_facto3_sched4_kway_pqrcpilu1 Start 3521: mpi_dst_example_simple_lap_z_facto4_sched4_not_svdbegin Start 3522: mpi_dst_example_simple_lap_z_facto4_sched4_not_svdend Start 3523: mpi_dst_example_simple_lap_z_facto4_sched4_kway_svdbegin Start 3524: mpi_dst_example_simple_lap_z_facto4_sched4_kway_svdend Start 3525: mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_svdbegin Start 3526: mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_svdend Start 3527: mpi_dst_example_simple_lap_z_facto4_sched4_not_pqrcpbegin Start 3528: mpi_dst_example_simple_lap_z_facto4_sched4_not_pqrcpend Start 3529: mpi_dst_example_simple_lap_z_facto4_sched4_kway_pqrcpbegin Start 3530: mpi_dst_example_simple_lap_z_facto4_sched4_kway_pqrcpend Start 3531: mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_pqrcpbegin Start 3532: mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_pqrcpend Start 3533: mpi_dst_example_simple_lap_z_facto4_sched4_not_rqrcpbegin Start 3534: mpi_dst_example_simple_lap_z_facto4_sched4_not_rqrcpend Start 3535: mpi_dst_example_simple_lap_z_facto4_sched4_kway_rqrcpbegin Start 3536: mpi_dst_example_simple_lap_z_facto4_sched4_kway_rqrcpend Start 3537: mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_rqrcpbegin Start 3538: mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_rqrcpend Start 3539: mpi_dst_example_simple_lap_z_facto4_sched4_not_tqrcpbegin Start 3540: mpi_dst_example_simple_lap_z_facto4_sched4_not_tqrcpend Start 3541: 
mpi_dst_example_simple_lap_z_facto4_sched4_kway_tqrcpbegin Start 3542: mpi_dst_example_simple_lap_z_facto4_sched4_kway_tqrcpend Start 3543: mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_tqrcpbegin Start 3544: mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_tqrcpend Start 3545: mpi_dst_example_simple_lap_z_facto4_sched4_not_rqrrtbegin Start 3546: mpi_dst_example_simple_lap_z_facto4_sched4_not_rqrrtend Start 3547: mpi_dst_example_simple_lap_z_facto4_sched4_kway_rqrrtbegin Start 3548: mpi_dst_example_simple_lap_z_facto4_sched4_kway_rqrrtend Start 3549: mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_rqrrtbegin Start 3550: mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_rqrrtend Start 3551: mpi_dst_example_simple_lap_z_facto4_sched4_kway_pqrcpilu0 Start 3552: mpi_dst_example_simple_lap_z_facto4_sched4_kway_pqrcpilu1 Start 3553: bcsc_shm_test_bcsc_spmv_tests_lap_s Start 3554: bcsc_shm_test_bcsc_spmv_tests_lap_d Start 3555: bcsc_shm_test_bcsc_spmv_tests_lap_c Start 3556: bcsc_shm_test_bcsc_spmv_tests_lap_z Start 3557: bcsc_shm_test_bcsc_spmv_tests_rsa Start 3558: bcsc_shm_test_bcsc_spmv_tests_mm Start 3559: bcsc_shm_test_bcsc_spmv_tests_hb Start 3560: bcsc_shm_test_bcsc_spmv_tests_mm2 Start 3561: bcsc_shm_test_bcsc_spmv_time_lap_s Start 3562: bcsc_shm_test_bcsc_spmv_time_lap_d Start 3563: bcsc_shm_test_bcsc_spmv_time_lap_c Start 3564: bcsc_shm_test_bcsc_spmv_time_lap_z Start 3565: bcsc_shm_test_bcsc_spmv_time_rsa Start 3566: bcsc_shm_test_bcsc_spmv_time_mm Start 3567: bcsc_shm_test_bcsc_spmv_time_hb Start 3568: bcsc_shm_test_bcsc_spmv_time_mm2 Start 3569: bcsc_shm_test_bvec_gemv_tests Start 3570: bcsc_shm_test_bvec_tests Start 3571: bcsc_shm_test_bvec_applyorder_tests Start 3572: bcsc_mpi_rep_test_bcsc_spmv_tests_lap_s Start 3573: bcsc_mpi_rep_test_bcsc_spmv_tests_lap_d Start 3574: bcsc_mpi_rep_test_bcsc_spmv_tests_lap_c Start 3575: bcsc_mpi_rep_test_bcsc_spmv_tests_lap_z Start 3576: bcsc_mpi_rep_test_bcsc_spmv_tests_rsa 
Start 3577: bcsc_mpi_rep_test_bcsc_spmv_tests_mm Start 3578: bcsc_mpi_rep_test_bcsc_spmv_tests_hb Start 3579: bcsc_mpi_rep_test_bcsc_spmv_tests_mm2 Start 3580: bcsc_mpi_rep_test_bcsc_spmv_time_lap_s Start 3581: bcsc_mpi_rep_test_bcsc_spmv_time_lap_d Start 3582: bcsc_mpi_rep_test_bcsc_spmv_time_lap_c Start 3583: bcsc_mpi_rep_test_bcsc_spmv_time_lap_z Start 3584: bcsc_mpi_rep_test_bcsc_spmv_time_rsa Start 3585: bcsc_mpi_rep_test_bcsc_spmv_time_mm Start 3586: bcsc_mpi_rep_test_bcsc_spmv_time_hb Start 3587: bcsc_mpi_rep_test_bcsc_spmv_time_mm2 Start 3588: bcsc_mpi_rep_test_bvec_gemv_tests Start 3589: bcsc_mpi_rep_test_bvec_tests Start 3590: bcsc_mpi_rep_test_bvec_applyorder_tests Start 3591: bcsc_mpi_dst_test_bcsc_spmv_tests_lap_s Start 3592: bcsc_mpi_dst_test_bcsc_spmv_tests_lap_d Start 3593: bcsc_mpi_dst_test_bcsc_spmv_tests_lap_c Start 3594: bcsc_mpi_dst_test_bcsc_spmv_tests_lap_z Start 3595: bcsc_mpi_dst_test_bcsc_spmv_tests_rsa Start 3596: bcsc_mpi_dst_test_bcsc_spmv_tests_mm Start 3597: bcsc_mpi_dst_test_bcsc_spmv_tests_hb Start 3598: bcsc_mpi_dst_test_bcsc_spmv_tests_mm2 Start 3599: bcsc_mpi_dst_test_bcsc_spmv_time_lap_s Start 3600: bcsc_mpi_dst_test_bcsc_spmv_time_lap_d Start 3601: bcsc_mpi_dst_test_bcsc_spmv_time_lap_c Start 3602: bcsc_mpi_dst_test_bcsc_spmv_time_lap_z Start 3603: bcsc_mpi_dst_test_bcsc_spmv_time_rsa Start 3604: bcsc_mpi_dst_test_bcsc_spmv_time_mm Start 3605: bcsc_mpi_dst_test_bcsc_spmv_time_hb Start 3606: bcsc_mpi_dst_test_bcsc_spmv_time_mm2 Start 3607: bcsc_mpi_dst_test_bvec_tests Start 3608: bcsc_mpi_dst_test_bvec_applyorder_tests Start 3609: fortran_shm_fsimple Start 3610: fortran_mpi_fsimple Start 3611: fortran_shm_flaplacian Start 3612: fortran_mpi_flaplacian Start 3613: fortran_shm_fstep-by-step Start 3614: fortran_mpi_fstep-by-step Start 3615: fortran_shm_fmultidof Start 3616: fortran_mpi_fmultidof Start 3617: fortran_shm_fusermat_csr Start 3618: fortran_mpi_fusermat_csr Start 3619: fortran_shm_fmultilap_seq Start 3620: 
fortran_shm_fmultilap_mt Start 3621: python_shm_simple Start 3622: python_mpi_simple Start 3623: python_shm_step-by-step Start 3624: python_mpi_step-by-step Start 3625: python_shm_simple_obj Start 3626: python_mpi_simple_obj Test #3073: mpi_dst_example_simple_lap_s_facto1_sched4_not_svdbegin .................***Timeout 621.20 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.021820e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol 
matrix 8.831440e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.908582e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.298965e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.890332e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.133524e+00 s Time to initialize coeftab 3.087147e+00 s Test #3103: mpi_dst_example_simple_lap_s_facto1_sched4_kway_pqrcpilu0 ...............***Timeout 620.48 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections 
distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.637633e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.493234e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.547512e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 9.623250e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.345271e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.028261e+00 s Time to initialize coeftab 1.299327e+00 s Time to factorize 9.312019e+01 s (57.55 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Test #3149: mpi_dst_example_simple_lap_d_facto0_sched4_not_rqrcpbegin ...............***Timeout 617.72 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.945491e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.245853e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.477105e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.116636e+01 s 
+-------------------------------------------------+ Analyze task: Total time for analyze 9.961897e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.513452e+00 s Time to initialize coeftab 1.220841e+01 s Time to factorize 5.013303e+01 s (103.40 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88 Ko / 88.6 Ko ------------------------------------------------ Total 136 Ko / 137 Ko Test #3233: mpi_dst_example_simple_lap_c_facto0_sched4_not_svdbegin .................***Timeout 599.46 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix 
type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.913120e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.084017e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.082266e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.498653e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.918942e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 5.678896e+00 s Time to initialize coeftab 8.072608e+00 s Start 3233: mpi_dst_example_simple_lap_c_facto0_sched4_not_svdbegin Test #3235: mpi_dst_example_simple_lap_c_facto0_sched4_kway_svdbegin ................***Timeout 599.47 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 
320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.934798e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.561970e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.391240e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.610290e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.778174e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 5.114372e+00 s Time to initialize coeftab 6.626777e+00 s Start 3235: mpi_dst_example_simple_lap_c_facto0_sched4_kway_svdbegin Test #3236: mpi_dst_example_simple_lap_c_facto0_sched4_kway_svdend ..................***Timeout 599.24 
sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.395325e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.172922e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.489667e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in 
full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.097247e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.293055e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.922495e+00 s Time to initialize coeftab 1.969032e+00 s Start 3236: mpi_dst_example_simple_lap_c_facto0_sched4_kway_svdend Test #3237: mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_svdbegin .....***Timeout 600.28 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering 
subtask : Ordering method is: Scotch Time to compute ordering 8.934526e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.606779e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.274640e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 5.722427e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.067544e+02 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.700359e+00 s Time to initialize coeftab 6.929135e+00 s Start 3237: mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_svdbegin Test #3238: mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_svdend .......***Timeout 600.91 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - 
Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 1: 300 1140 0: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.482583e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.505397e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.269970e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 8.582832e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.583981e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.219536e+00 s Time to initialize coeftab 5.554627e+00 s Start 3238: mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_svdend Test #3239: mpi_dst_example_simple_lap_c_facto0_sched4_not_pqrcpbegin ...............***Timeout 601.48 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread 
number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.874163e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.150813e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.790470e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.089860e+01 s 
+-------------------------------------------------+ Analyze task: Total time for analyze 3.448486e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.745415e+00 s Time to initialize coeftab 4.333532e+00 s Start 3239: mpi_dst_example_simple_lap_c_facto0_sched4_not_pqrcpbegin Test #3241: mpi_dst_example_simple_lap_c_facto0_sched4_kway_pqrcpbegin ..............***Timeout 603.57 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.430543e+01 s +-------------------------------------------------+ Symbolic 
factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.374029e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.468582e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.099293e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.470154e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.264399e+00 s Time to initialize coeftab 8.156065e+00 s Time to factorize 1.167215e+02 s (177.92 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Start 3241: mpi_dst_example_simple_lap_c_facto0_sched4_kway_pqrcpbegin Test #3245: mpi_dst_example_simple_lap_c_facto0_sched4_not_rqrcpbegin ...............***Timeout 602.63 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking 
size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.843877e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.235080e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.330214e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 4.296307e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.670736e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.700470e+00 s Time to initialize coeftab 1.941283e+01 s Start 3245: mpi_dst_example_simple_lap_c_facto0_sched4_not_rqrcpbegin Test #3247: 
mpi_dst_example_simple_lap_c_facto0_sched4_kway_rqrcpbegin ..............***Timeout 602.15 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.528731e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.479666e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.360520e+00 s +-------------------------------------------------+ Mapping/Scheduling 
subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 5.393203e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.354757e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 7.462149e-01 s Time to initialize coeftab 6.541009e+00 s Start 3247: mpi_dst_example_simple_lap_c_facto0_sched4_kway_rqrcpbegin Test #3251: mpi_dst_example_simple_lap_c_facto0_sched4_not_tqrcpbegin ...............***Timeout 603.11 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 
2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.531869e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.975646e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.790177e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 5.496452e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.113620e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.117523e+00 s Time to initialize coeftab 6.926153e+00 s Start 3251: mpi_dst_example_simple_lap_c_facto0_sched4_not_tqrcpbegin Test #3253: mpi_dst_example_simple_lap_c_facto0_sched4_kway_tqrcpbegin ..............***Timeout 604.15 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 
6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.826047e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.227728e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.596458e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 7.281062e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.175502e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.375629e+00 s Time to initialize coeftab 6.387733e+00 s Start 3253: mpi_dst_example_simple_lap_c_facto0_sched4_kway_tqrcpbegin Test #3255: mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_tqrcpbegin ...***Timeout 604.14 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has 
been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.138890e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.099142e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.502223e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 
5.477282e-04 s Time for mapping/scheduling 7.244736e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.087435e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.240593e+00 s Time to initialize coeftab 8.664293e+00 s Start 3255: mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_tqrcpbegin Test #3257: mpi_dst_example_simple_lap_c_facto0_sched4_not_rqrrtbegin ...............***Timeout 604.64 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.062944e+01 
s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.593838e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.675772e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 6.944086e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.100478e+02 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.827444e+00 s Time to initialize coeftab 7.327015e+00 s Start 3257: mpi_dst_example_simple_lap_c_facto0_sched4_not_rqrrtbegin Test #3101: mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_rqrrtbegin ...***Timeout 587.10 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory 
Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.973155e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.294346e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.684069e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.993363e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.540162e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.558483e+00 s Time to initialize coeftab 4.180817e+00 s Time to factorize 1.223993e+02 s (43.78 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Test 
#3121: mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_rqrcpbegin ...***Timeout 504.36 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.538692e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.902399e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.300784e-01 s +-------------------------------------------------+ 
Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 7.878534e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.405164e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.546452e+00 s Time to initialize coeftab 5.265210e+00 s Time to factorize 5.798149e+01 s (176.34 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.2 Ko / 88.6 Ko ------------------------------------------------ Total 112 Ko / 113 Ko Test #3145: mpi_dst_example_simple_lap_d_facto0_sched4_kway_pqrcpbegin ..............***Timeout 497.81 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 
16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.693605e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.177450e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.985234e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 6.987496e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.504263e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.027180e+00 s Time to initialize coeftab 2.759032e+00 s Time to factorize 6.558174e+01 s (79.04 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.035137e+01 s Test #3260: mpi_dst_example_simple_lap_c_facto0_sched4_kway_rqrrtend 
................***Timeout 440.79 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.427564e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.238350e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.303236e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 
17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 9.327846e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.649321e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.272131e+00 s Time to initialize coeftab 4.596296e-01 s Start 3260: mpi_dst_example_simple_lap_c_facto0_sched4_kway_rqrrtend Test #3263: mpi_dst_example_simple_lap_c_facto0_sched4_kway_pqrcpilu0 ...............***Timeout 443.23 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 
+-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.919685e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.564292e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.952838e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.986300e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.674547e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 9.204337e-01 s Time to initialize coeftab 3.255259e-01 s Start 3263: mpi_dst_example_simple_lap_c_facto0_sched4_kway_pqrcpilu0 Test #3298: mpi_dst_example_simple_lap_c_facto2_sched4_not_svdend ...................***Timeout 426.16 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 
320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.083491e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.116475e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.577731e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.896876e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.626215e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.466309e+00 s Time to initialize coeftab 6.066709e-01 s Start 3298: mpi_dst_example_simple_lap_c_facto2_sched4_not_svdend Test #3304: mpi_dst_example_simple_lap_c_facto2_sched4_not_pqrcpend .................***Timeout 426.04 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.669824e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.764283e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.789268e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 
2.404692e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.275271e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.897207e+00 s Time to initialize coeftab 4.528065e-01 s Start 3304: mpi_dst_example_simple_lap_c_facto2_sched4_not_pqrcpend Test #3306: mpi_dst_example_simple_lap_c_facto2_sched4_kway_pqrcpend ................***Timeout 427.75 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.671183e+01 s +-------------------------------------------------+ Symbolic 
factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.821059e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.360953e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 4.509860e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.416367e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.443399e+00 s Time to initialize coeftab 7.827735e-01 s Start 3306: mpi_dst_example_simple_lap_c_facto2_sched4_kway_pqrcpend Test #3307: mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_pqrcpbegin ...***Timeout 427.75 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 
Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.059814e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.040718e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.942651e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 3.451333e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.729675e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.023585e+00 s Time to initialize coeftab 3.615651e+00 s Start 3307: mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_pqrcpbegin Test #3308: mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_pqrcpend .....***Timeout 428.10 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.766891e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.324479e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.343331e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 6.624934e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.883325e+01 s 
+-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.284963e-01 s Time to initialize coeftab 1.535264e-01 s Start 3308: mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_pqrcpend Test #3309: mpi_dst_example_simple_lap_c_facto2_sched4_not_rqrcpbegin ...............***Timeout 429.17 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.305762e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 
Fill-in of L 17.592432 Time to compute symbol matrix 1.654952e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.483236e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 4.298256e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.107894e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.228515e+00 s Start 3309: mpi_dst_example_simple_lap_c_facto2_sched4_not_rqrcpbegin Test #3310: mpi_dst_example_simple_lap_c_facto2_sched4_not_rqrcpend .................***Timeout 429.72 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not 
used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.659109e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.965245e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.063987e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.665301e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.945455e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.244956e-01 s Time to initialize coeftab 2.399947e-01 s Start 3310: mpi_dst_example_simple_lap_c_facto2_sched4_not_rqrcpend Test #3311: mpi_dst_example_simple_lap_c_facto2_sched4_kway_rqrcpbegin ..............***Timeout 432.34 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started 
thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 3: 200 660 2: 200 760 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.146836e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.255760e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.627024e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.911995e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.722048e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.599249e+00 s Time to initialize coeftab 6.779532e+00 s Start 3311: 
mpi_dst_example_simple_lap_c_facto2_sched4_kway_rqrcpbegin Test #3312: mpi_dst_example_simple_lap_c_facto2_sched4_kway_rqrcpend ................***Timeout 432.34 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch 1: 300 1140 2: 200 760 3: 200 660 Time to compute ordering 4.885360e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.819584e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.157969e-01 s 
+-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.862341e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.289302e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.574387e+00 s Time to initialize coeftab 7.245562e-01 s Start 3312: mpi_dst_example_simple_lap_c_facto2_sched4_kway_rqrcpend Test #3313: mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_rqrcpbegin ...***Timeout 432.34 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: 
Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.697886e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.324902e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.523929e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.831271e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.269171e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.990449e+00 s Time to initialize coeftab 7.256077e+00 s Start 3313: mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_rqrcpbegin Test #3314: mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_rqrcpend .....***Timeout 433.16 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 
MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.672530e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.654911e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.664302e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 7.254995e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.504136e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 4.942099e+00 s Start 3314: mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_rqrcpend Test #3315: mpi_dst_example_simple_lap_c_facto2_sched4_not_tqrcpbegin ...............***Timeout 434.61 sec ischedInit: The thread number has 
been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.987114e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.012370e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.088854e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops 
Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 3.138696e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.542043e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.942907e+00 s Time to initialize coeftab 6.632560e+00 s Start 3315: mpi_dst_example_simple_lap_c_facto2_sched4_not_tqrcpbegin Test #3316: mpi_dst_example_simple_lap_c_facto2_sched4_not_tqrcpend .................***Timeout 437.57 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time 
to compute ordering 5.843493e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.077877e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.207042e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 3.015820e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.163861e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.396717e+00 s Time to initialize coeftab 8.940759e-01 s Start 3316: mpi_dst_example_simple_lap_c_facto2_sched4_not_tqrcpend Test #3317: mpi_dst_example_simple_lap_c_facto2_sched4_kway_tqrcpbegin ..............***Timeout 438.35 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank 
parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch 1: 300 1140 2: 200 760 3: 200 660 Time to compute ordering 4.736221e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.029350e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.307148e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.798940e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.067837e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.902672e+00 s Time to initialize coeftab 6.449887e+00 s Start 3317: mpi_dst_example_simple_lap_c_facto2_sched4_kway_tqrcpbegin Test #3318: mpi_dst_example_simple_lap_c_facto2_sched4_kway_tqrcpend ................***Timeout 439.21 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has 
been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.936499e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.794576e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.831356e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 4.625170e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 
4.535836e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.158178e+00 s Time to initialize coeftab 7.062281e-01 s Start 3318: mpi_dst_example_simple_lap_c_facto2_sched4_kway_tqrcpend Test #3319: mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_tqrcpbegin ...***Timeout 439.22 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.339052e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L 
structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.981394e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.015921e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 3.345653e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.910042e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 7.128165e-01 s Time to initialize coeftab 7.129331e+00 s Start 3319: mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_tqrcpbegin Test #3320: mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_tqrcpend .....***Timeout 438.67 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion 
per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.682609e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.835275e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.627701e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 5.838026e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.399857e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.704475e+00 s Time to initialize coeftab 9.403614e-01 s Start 3320: mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_tqrcpend Test #3321: mpi_dst_example_simple_lap_c_facto2_sched4_not_rqrrtbegin ...............***Timeout 438.67 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.212800e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.708947e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.273426e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.219071e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.929321e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to 
initialize internal csc 2.597220e+00 s Start 3321: mpi_dst_example_simple_lap_c_facto2_sched4_not_rqrrtbegin Test #3322: mpi_dst_example_simple_lap_c_facto2_sched4_not_rqrrtend .................***Timeout 438.70 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.786465e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.029842e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 
Stoping criterion -1 Time for reordering 1.733873e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 3.001420e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.169162e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.608306e+00 s Time to initialize coeftab 4.646990e-01 s Start 3322: mpi_dst_example_simple_lap_c_facto2_sched4_not_rqrrtend Test #3323: mpi_dst_example_simple_lap_c_facto2_sched4_kway_rqrrtbegin ..............***Timeout 438.14 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 
1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.498164e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.490134e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.697386e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 5.788481e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.360233e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.189500e+00 s Start 3323: mpi_dst_example_simple_lap_c_facto2_sched4_kway_rqrrtbegin Test #3324: mpi_dst_example_simple_lap_c_facto2_sched4_kway_rqrrtend ................***Timeout 438.81 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI 
communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.815802e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.408458e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.700597e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.697487e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.122274e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.219357e+00 s Time to initialize coeftab 1.248970e-01 s Start 3324: mpi_dst_example_simple_lap_c_facto2_sched4_kway_rqrrtend Test #3327: mpi_dst_example_simple_lap_c_facto2_sched4_kway_pqrcpilu0 ...............***Timeout 425.38 sec ischedInit: The thread 
number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.541240e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.295600e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.069681e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 
MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 5.745569e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.436978e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.070272e+00 s Time to initialize coeftab 1.309148e+00 s Start 3327: mpi_dst_example_simple_lap_c_facto2_sched4_kway_pqrcpilu0 Test #3328: mpi_dst_example_simple_lap_c_facto2_sched4_kway_pqrcpilu1 ...............***Timeout 425.39 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch 
Time to compute ordering 3.534713e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.719545e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.955606e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.888309e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.948090e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.372793e+00 s Time to initialize coeftab 3.470187e-01 s Start 3328: mpi_dst_example_simple_lap_c_facto2_sched4_kway_pqrcpilu1 Test #3330: mpi_dst_example_simple_lap_c_facto3_sched4_not_svdend ...................***Timeout 425.26 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low 
rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.445585e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.427324e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.125714e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.314149e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.814518e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 2.012669e+00 s Time to initialize coeftab 6.160221e-01 s Start 3330: mpi_dst_example_simple_lap_c_facto3_sched4_not_svdend Test #3334: mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_svdend .......***Timeout 424.10 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has 
been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.575341e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.513154e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.078713e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.920376e+00 s +-------------------------------------------------+ Analyze task: Total time for 
analyze 3.953468e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 1.975421e+00 s Time to initialize coeftab 4.006546e-01 s Start 3334: mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_svdend Test #3335: mpi_dst_example_simple_lap_c_facto3_sched4_not_pqrcpbegin ...............***Timeout 424.12 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch 1: 300 1140 2: 200 760 3: 200 660 Time to compute ordering 2.899976e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of 
nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.759591e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.290964e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 6.784230e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.698754e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 4.347140e+00 s Start 3335: mpi_dst_example_simple_lap_c_facto3_sched4_not_pqrcpbegin Test #3336: mpi_dst_example_simple_lap_c_facto3_sched4_not_pqrcpend .................***Timeout 423.83 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 
Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.471663e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.103857e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.898563e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.836484e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.800770e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 1.442913e+00 s Time to initialize coeftab 3.226219e-01 s Start 3336: mpi_dst_example_simple_lap_c_facto3_sched4_not_pqrcpend Test #3337: mpi_dst_example_simple_lap_c_facto3_sched4_kway_pqrcpbegin ..............***Timeout 424.74 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: 
sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.947070e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.767449e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.248298e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.984411e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.476025e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 2.382171e+00 s Time to initialize coeftab 1.805720e+00 s Start 
3337: mpi_dst_example_simple_lap_c_facto3_sched4_kway_pqrcpbegin Test #3338: mpi_dst_example_simple_lap_c_facto3_sched4_kway_pqrcpend ................***Timeout 425.12 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.722028e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.215244e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 
7.761268e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.055560e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.140091e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 2.055703e+00 s Time to initialize coeftab 5.370914e-01 s Start 3338: mpi_dst_example_simple_lap_c_facto3_sched4_kway_pqrcpend Test #3339: mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_pqrcpbegin ...***Timeout 426.54 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: 
Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.496045e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.034168e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.268121e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.639615e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.080918e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 1.865572e+00 s Time to initialize coeftab 1.950122e+00 s Start 3339: mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_pqrcpbegin Test #3340: mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_pqrcpend .....***Timeout 427.24 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per 
process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.188523e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.969986e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.490443e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.354347e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.507043e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 8.364725e-01 s Time to initialize coeftab 3.748292e-01 s Start 3340: mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_pqrcpend Test #3341: 
mpi_dst_example_simple_lap_c_facto3_sched4_not_rqrcpbegin ...............***Timeout 427.24 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.883071e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.230646e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.600618e-01 s +-------------------------------------------------+ 
Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 4.120339e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.434618e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 1.965852e+00 s Time to initialize coeftab 6.375890e+00 s Start 3341: mpi_dst_example_simple_lap_c_facto3_sched4_not_rqrcpbegin Test #3342: mpi_dst_example_simple_lap_c_facto3_sched4_not_rqrcpend .................***Timeout 427.97 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 
1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.491124e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.750142e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.598990e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.678496e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.823282e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 9.151708e-01 s Time to initialize coeftab 2.537228e-01 s Start 3342: mpi_dst_example_simple_lap_c_facto3_sched4_not_rqrcpend Test #3344: mpi_dst_example_simple_lap_c_facto3_sched4_kway_rqrcpend ................***Timeout 427.45 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models 
CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.359703e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.628207e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.623608e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.020001e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.725229e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 8.292963e-01 s Time to initialize coeftab 3.396344e-01 s Start 3344: mpi_dst_example_simple_lap_c_facto3_sched4_kway_rqrcpend Test #3346: mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_rqrcpend .....***Timeout 417.39 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread 
number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.592814e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.498713e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.710026e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to 
factorize 5.477282e-04 s Time for mapping/scheduling 3.176582e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.948704e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 1.768407e+00 s Time to initialize coeftab 2.703577e-01 s Start 3346: mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_rqrcpend Test #3347: mpi_dst_example_simple_lap_c_facto3_sched4_not_tqrcpbegin ...............***Timeout 417.39 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 
2.772985e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.004255e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.594713e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.743093e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.175145e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 1.773989e+00 s Time to initialize coeftab 3.310108e+00 s Start 3347: mpi_dst_example_simple_lap_c_facto3_sched4_not_tqrcpbegin Test #3348: mpi_dst_example_simple_lap_c_facto3_sched4_not_tqrcpend .................***Timeout 418.09 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 
1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.590208e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.884442e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.912984e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.777767e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.207840e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 1.858390e+00 s Time to initialize coeftab 1.225211e+00 s Start 3348: mpi_dst_example_simple_lap_c_facto3_sched4_not_tqrcpend Test #3350: mpi_dst_example_simple_lap_c_facto3_sched4_kway_tqrcpend ................***Timeout 418.56 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been 
automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.395355e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.152997e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.382640e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 7.971100e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 
2.561705e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 1.420719e-01 s Time to initialize coeftab 8.741747e-02 s Start 3350: mpi_dst_example_simple_lap_c_facto3_sched4_kway_tqrcpend Test #3352: mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_tqrcpend .....***Timeout 417.55 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.160806e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L 
structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.512187e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.355979e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.583485e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.780843e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 1.307512e+00 s Time to initialize coeftab 1.900104e-01 s Start 3352: mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_tqrcpend Test #3353: mpi_dst_example_simple_lap_c_facto3_sched4_not_rqrrtbegin ...............***Timeout 417.55 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting 
Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.212255e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.974148e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.501158e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.754858e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.616776e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 2.336996e+00 s Time to initialize coeftab 5.897306e+00 s Start 3353: mpi_dst_example_simple_lap_c_facto3_sched4_not_rqrrtbegin Test #3354: mpi_dst_example_simple_lap_c_facto3_sched4_not_rqrrtend .................***Timeout 417.53 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.704316e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.160119e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.268804e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.020651e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.141273e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to 
initialize internal csc 2.402843e+00 s Time to initialize coeftab 1.376005e+00 s Start 3354: mpi_dst_example_simple_lap_c_facto3_sched4_not_rqrrtend Test #3355: mpi_dst_example_simple_lap_c_facto3_sched4_kway_rqrrtbegin ..............***Timeout 417.52 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.473599e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.693560e+00 s +-------------------------------------------------+ 
Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.845281e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.117496e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.017148e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 2.838944e+00 s Start 3355: mpi_dst_example_simple_lap_c_facto3_sched4_kway_rqrrtbegin Test #3356: mpi_dst_example_simple_lap_c_facto3_sched4_kway_rqrrtend ................***Timeout 418.58 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 
Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.356594e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.049934e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.418869e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 8.812226e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.338087e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Start 3356: mpi_dst_example_simple_lap_c_facto3_sched4_kway_rqrrtend Test #3357: mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_rqrrtbegin ...***Timeout 419.36 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational 
models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.065390e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.720654e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.887992e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.914423e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.535556e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 1.926457e+00 s Time to initialize coeftab 4.030834e+00 s Start 3357: mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_rqrrtbegin Test #3358: mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_rqrrtend .....***Timeout 419.36 sec ischedInit: The thread number has been 
automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.419878e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.835672e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.551985e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops 
Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 4.173417e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.915611e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 3.106443e+00 s Time to initialize coeftab 8.739649e-01 s Start 3358: mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_rqrrtend Test #3359: mpi_dst_example_simple_lap_c_facto3_sched4_kway_pqrcpilu0 ...............***Timeout 420.47 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: 
Scotch Time to compute ordering 1.833800e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.655607e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.307415e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 6.632115e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.604594e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 1.959355e+00 s Time to initialize coeftab 1.461994e+00 s Start 3359: mpi_dst_example_simple_lap_c_facto3_sched4_kway_pqrcpilu0 Test #3360: mpi_dst_example_simple_lap_c_facto3_sched4_kway_pqrcpilu1 ...............***Timeout 420.03 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 
8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.686795e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.075260e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.880572e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.654100e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.225527e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 2.066669e+00 s Time to initialize coeftab 6.599361e-01 s Start 3360: mpi_dst_example_simple_lap_c_facto3_sched4_kway_pqrcpilu1 Test #3364: mpi_dst_example_simple_lap_c_facto4_sched4_kway_svdend ..................***Timeout 417.74 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The 
thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.041437e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.403492e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.393427e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.091400e+00 s +-------------------------------------------------+ Analyze task: Total time 
for analyze 2.709710e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 4.423295e+00 s Time to initialize coeftab 1.426994e+00 s Start 3364: mpi_dst_example_simple_lap_c_facto4_sched4_kway_svdend Test #3366: mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_svdend .......***Timeout 417.28 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.963518e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of 
nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.968796e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.826770e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.239694e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.393142e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 3.100420e+00 s Time to initialize coeftab 1.369046e+00 s Start 3366: mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_svdend Test #3370: mpi_dst_example_simple_lap_c_facto4_sched4_kway_pqrcpend ................***Timeout 417.00 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 
Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.795123e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.386540e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.219064e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.154307e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.294015e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 2.707169e+00 s Time to initialize coeftab 6.178053e-01 s Start 3370: mpi_dst_example_simple_lap_c_facto4_sched4_kway_pqrcpend Test #3372: mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_pqrcpend .....***Timeout 416.88 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.716474e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.719525e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.315483e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.974105e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.044192e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h 
Time to initialize internal csc 1.516716e+00 s Time to initialize coeftab 1.152908e+00 s Start 3372: mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_pqrcpend Test #3376: mpi_dst_example_simple_lap_c_facto4_sched4_kway_rqrcpend ................***Timeout 417.93 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.912556e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.707795e-01 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.381460e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.155508e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.339929e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 3.105929e+00 s Time to initialize coeftab 9.428167e-01 s Start 3376: mpi_dst_example_simple_lap_c_facto4_sched4_kway_rqrcpend Test #3378: mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_rqrcpend .....***Timeout 418.53 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections 
distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.262361e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.173398e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.367178e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.886522e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.710852e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 2.388170e+00 s Time to initialize coeftab 1.983725e+00 s Start 3378: mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_rqrcpend Test #3380: mpi_dst_example_simple_lap_c_facto4_sched4_not_tqrcpend .................***Timeout 420.10 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread 
static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.063117e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.295930e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.010984e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.533038e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.854831e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 2.887696e+00 s Time to initialize coeftab 1.835753e+00 s Start 3380: 
mpi_dst_example_simple_lap_c_facto4_sched4_not_tqrcpend Test #3382: mpi_dst_example_simple_lap_c_facto4_sched4_kway_tqrcpend ................***Timeout 421.30 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.793014e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.424713e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.315498e-01 s 
+-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.672069e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.216592e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 1.415311e+00 s Time to initialize coeftab 5.455077e-01 s Start 3382: mpi_dst_example_simple_lap_c_facto4_sched4_kway_tqrcpend Test #3384: mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_tqrcpend .....***Timeout 421.20 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian 
Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.793165e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.400932e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.441062e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.452105e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.252318e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 2.081890e+00 s Time to initialize coeftab 5.388874e-01 s Start 3384: mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_tqrcpend Test #3386: mpi_dst_example_simple_lap_c_facto4_sched4_not_rqrrtend .................***Timeout 421.12 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 
Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.198346e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.057045e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.466761e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.514676e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.595992e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 1.428012e+00 s Time to initialize coeftab 1.313714e+00 s Start 3386: mpi_dst_example_simple_lap_c_facto4_sched4_not_rqrrtend Test #3388: mpi_dst_example_simple_lap_c_facto4_sched4_kway_rqrrtend ................***Timeout 
421.26 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.808924e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.273014e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.109423e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of 
operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.649765e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.260492e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 2.472204e+00 s Time to initialize coeftab 1.744414e+00 s Start 3388: mpi_dst_example_simple_lap_c_facto4_sched4_kway_rqrrtend Test #3390: mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_rqrrtend .....***Timeout 421.13 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 
+-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.673869e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.332050e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.876826e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.985520e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.938571e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 2.123007e+00 s Time to initialize coeftab 1.362694e+00 s Start 3390: mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_rqrrtend Test #3391: mpi_dst_example_simple_lap_c_facto4_sched4_kway_pqrcpilu0 ...............***Timeout 421.14 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size 
(min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.527734e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.063465e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.214658e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.674172e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.409804e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Start 3391: mpi_dst_example_simple_lap_c_facto4_sched4_kway_pqrcpilu0 Test #3398: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_svdend .......***Timeout 406.13 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.366086e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.832001e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.419200e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.212337e+00 s 
+-------------------------------------------------+ Analyze task: Total time for analyze 1.818386e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.244949e+00 s Time to initialize coeftab 3.446792e+00 s Start 3398: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_svdend Test #3400: mpi_dst_example_simple_lap_z_facto0_sched4_not_pqrcpend .................***Timeout 406.01 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.338845e+01 s +-------------------------------------------------+ Symbolic 
factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.817265e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.388882e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 8.090498e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.375844e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Start 3400: mpi_dst_example_simple_lap_z_facto0_sched4_not_pqrcpend Test #3402: mpi_dst_example_simple_lap_z_facto0_sched4_kway_pqrcpend ................***Timeout 407.07 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block 
Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.280353e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.812818e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.013301e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.892605e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.704970e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.726017e+00 s Time to initialize coeftab 3.075018e+00 s Start 3402: mpi_dst_example_simple_lap_z_facto0_sched4_kway_pqrcpend Test #3404: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_pqrcpend .....***Timeout 406.95 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 
6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.385172e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.226542e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.087928e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 4.769280e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.908281e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.260941e+00 s Time to initialize 
coeftab 2.086809e+00 s Start 3404: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_pqrcpend Test #3406: mpi_dst_example_simple_lap_z_facto0_sched4_not_rqrcpend .................***Timeout 408.75 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.342920e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.240276e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping 
criterion -1 Time for reordering 2.780716e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 4.464164e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.989505e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.371100e+00 s Time to initialize coeftab 1.730039e+00 s Start 3406: mpi_dst_example_simple_lap_z_facto0_sched4_not_rqrcpend Test #3408: mpi_dst_example_simple_lap_z_facto0_sched4_kway_rqrcpend ................***Timeout 411.26 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 
Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.322812e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.237929e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.043124e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.034261e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.700775e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.820596e+00 s Time to initialize coeftab 8.880151e-01 s Start 3408: mpi_dst_example_simple_lap_z_facto0_sched4_kway_rqrcpend Test #3410: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_rqrcpend .....***Timeout 412.14 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per 
process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.580486e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.209425e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.395837e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.098106e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.749544e+01 s Start 3410: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_rqrcpend 3144/3626 Test #3442: mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_rqrcpend .....***Timeout 395.44 sec Start 3442: mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_rqrcpend 3144/3626 Test #3443: 
mpi_dst_example_simple_lap_z_facto1_sched4_not_tqrcpbegin ...............***Timeout 395.47 sec Start 3443: mpi_dst_example_simple_lap_z_facto1_sched4_not_tqrcpbegin 3144/3626 Test #3444: mpi_dst_example_simple_lap_z_facto1_sched4_not_tqrcpend .................***Timeout 395.48 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3444: mpi_dst_example_simple_lap_z_facto1_sched4_not_tqrcpend 3144/3626 Test #3445: mpi_dst_example_simple_lap_z_facto1_sched4_kway_tqrcpbegin ..............***Timeout 397.09 sec Start 3445: mpi_dst_example_simple_lap_z_facto1_sched4_kway_tqrcpbegin 3144/3626 Test #3446: 
mpi_dst_example_simple_lap_z_facto1_sched4_kway_tqrcpend ................***Timeout 397.09 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 3446: mpi_dst_example_simple_lap_z_facto1_sched4_kway_tqrcpend 3144/3626 Test #3447: mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_tqrcpbegin ...***Timeout 397.09 sec Start 3447: mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_tqrcpbegin 3144/3626 Test #3448: mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_tqrcpend .....***Timeout 397.09 sec Start 3448: mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_tqrcpend 3144/3626 Test #3449: mpi_dst_example_simple_lap_z_facto1_sched4_not_rqrrtbegin ...............***Timeout 399.14 sec Start 3449: mpi_dst_example_simple_lap_z_facto1_sched4_not_rqrrtbegin 3144/3626 Test #3450: 
mpi_dst_example_simple_lap_z_facto1_sched4_not_rqrrtend .................***Timeout 399.49 sec Start 3450: mpi_dst_example_simple_lap_z_facto1_sched4_not_rqrrtend 3144/3626 Test #3451: mpi_dst_example_simple_lap_z_facto1_sched4_kway_rqrrtbegin ..............***Timeout 402.15 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 3451: mpi_dst_example_simple_lap_z_facto1_sched4_kway_rqrrtbegin 3144/3626 Test #3452: mpi_dst_example_simple_lap_z_facto1_sched4_kway_rqrrtend ................***Timeout 403.15 sec Start 3452: mpi_dst_example_simple_lap_z_facto1_sched4_kway_rqrrtend 3144/3626 Test #3453: mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_rqrrtbegin ...***Timeout 403.03 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT ischedInit: The thread number has been automatically set to 256 Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 3453: mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_rqrrtbegin 3144/3626 Test #3454: mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_rqrrtend .....***Timeout 403.72 sec Start 3454: mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_rqrrtend 3144/3626 Test #3455: mpi_dst_example_simple_lap_z_facto1_sched4_kway_pqrcpilu0 ...............***Timeout 403.68 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 3455: mpi_dst_example_simple_lap_z_facto1_sched4_kway_pqrcpilu0 3144/3626 Test #3456: mpi_dst_example_simple_lap_z_facto1_sched4_kway_pqrcpilu1 ...............***Timeout 403.96 sec Start 3456: mpi_dst_example_simple_lap_z_facto1_sched4_kway_pqrcpilu1 3144/3626 Test #3457: mpi_dst_example_simple_lap_z_facto2_sched4_not_svdbegin .................***Timeout 405.51 sec ischedInit: The thread number has been automatically set to 
256 ischedInit: The thread number has been automatically set to 256 Start 3457: mpi_dst_example_simple_lap_z_facto2_sched4_not_svdbegin 3144/3626 Test #3458: mpi_dst_example_simple_lap_z_facto2_sched4_not_svdend ...................***Timeout 405.51 sec Start 3458: mpi_dst_example_simple_lap_z_facto2_sched4_not_svdend 3144/3626 Test #3459: mpi_dst_example_simple_lap_z_facto2_sched4_kway_svdbegin ................***Timeout 405.40 sec Start 3459: mpi_dst_example_simple_lap_z_facto2_sched4_kway_svdbegin 3144/3626 Test #3460: mpi_dst_example_simple_lap_z_facto2_sched4_kway_svdend ..................***Timeout 405.41 sec ischedInit: The thread number has been automatically set to 256 Start 3460: mpi_dst_example_simple_lap_z_facto2_sched4_kway_svdend 3144/3626 Test #3461: mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_svdbegin .....***Timeout 405.29 sec Start 3461: mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_svdbegin 3144/3626 Test #3462: mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_svdend .......***Timeout 405.29 sec Start 3462: mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_svdend 3144/3626 Test #3463: mpi_dst_example_simple_lap_z_facto2_sched4_not_pqrcpbegin ...............***Timeout 405.16 sec Start 3463: mpi_dst_example_simple_lap_z_facto2_sched4_not_pqrcpbegin 3144/3626 Test #3464: mpi_dst_example_simple_lap_z_facto2_sched4_not_pqrcpend .................***Timeout 405.16 sec Start 3464: mpi_dst_example_simple_lap_z_facto2_sched4_not_pqrcpend 3144/3626 Test #3465: mpi_dst_example_simple_lap_z_facto2_sched4_kway_pqrcpbegin ..............***Timeout 405.15 sec Start 3465: mpi_dst_example_simple_lap_z_facto2_sched4_kway_pqrcpbegin 3144/3626 Test #3466: mpi_dst_example_simple_lap_z_facto2_sched4_kway_pqrcpend ................***Timeout 405.15 sec Start 3466: mpi_dst_example_simple_lap_z_facto2_sched4_kway_pqrcpend 3144/3626 Test #3467: mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_pqrcpbegin ...***Timeout 
405.16 sec Start 3467: mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_pqrcpbegin 3144/3626 Test #3468: mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_pqrcpend .....***Timeout 405.17 sec Start 3468: mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_pqrcpend 3144/3626 Test #3469: mpi_dst_example_simple_lap_z_facto2_sched4_not_rqrcpbegin ...............***Timeout 404.78 sec Start 3469: mpi_dst_example_simple_lap_z_facto2_sched4_not_rqrcpbegin 3144/3626 Test #3470: mpi_dst_example_simple_lap_z_facto2_sched4_not_rqrcpend .................***Timeout 404.82 sec Start 3470: mpi_dst_example_simple_lap_z_facto2_sched4_not_rqrcpend 3144/3626 Test #3471: mpi_dst_example_simple_lap_z_facto2_sched4_kway_rqrcpbegin ..............***Timeout 404.84 sec Start 3471: mpi_dst_example_simple_lap_z_facto2_sched4_kway_rqrcpbegin 3144/3626 Test #3472: mpi_dst_example_simple_lap_z_facto2_sched4_kway_rqrcpend ................***Timeout 404.87 sec Start 3472: mpi_dst_example_simple_lap_z_facto2_sched4_kway_rqrcpend 3144/3626 Test #3473: mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_rqrcpbegin ...***Timeout 404.84 sec Start 3473: mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_rqrcpbegin 3144/3626 Test #3474: mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_rqrcpend .....***Timeout 404.81 sec Start 3474: mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_rqrcpend 3144/3626 Test #3475: mpi_dst_example_simple_lap_z_facto2_sched4_not_tqrcpbegin ...............***Timeout 404.82 sec Start 3475: mpi_dst_example_simple_lap_z_facto2_sched4_not_tqrcpbegin 3144/3626 Test #3476: mpi_dst_example_simple_lap_z_facto2_sched4_not_tqrcpend .................***Timeout 404.84 sec Start 3476: mpi_dst_example_simple_lap_z_facto2_sched4_not_tqrcpend 3144/3626 Test #3477: mpi_dst_example_simple_lap_z_facto2_sched4_kway_tqrcpbegin ..............***Timeout 404.65 sec Start 3477: mpi_dst_example_simple_lap_z_facto2_sched4_kway_tqrcpbegin 
3144/3626 Test #3478: mpi_dst_example_simple_lap_z_facto2_sched4_kway_tqrcpend ................***Timeout 404.67 sec Start 3478: mpi_dst_example_simple_lap_z_facto2_sched4_kway_tqrcpend 3144/3626 Test #3479: mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_tqrcpbegin ...***Timeout 404.70 sec Start 3479: mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_tqrcpbegin 3144/3626 Test #3480: mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_tqrcpend .....***Timeout 404.75 sec Start 3480: mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_tqrcpend 3144/3626 Test #3481: mpi_dst_example_simple_lap_z_facto2_sched4_not_rqrrtbegin ...............***Timeout 404.78 sec Start 3481: mpi_dst_example_simple_lap_z_facto2_sched4_not_rqrrtbegin 3144/3626 Test #3482: mpi_dst_example_simple_lap_z_facto2_sched4_not_rqrrtend .................***Timeout 404.81 sec Start 3482: mpi_dst_example_simple_lap_z_facto2_sched4_not_rqrrtend 3144/3626 Test #3483: mpi_dst_example_simple_lap_z_facto2_sched4_kway_rqrrtbegin ..............***Timeout 404.77 sec Start 3483: mpi_dst_example_simple_lap_z_facto2_sched4_kway_rqrrtbegin 3144/3626 Test #3484: mpi_dst_example_simple_lap_z_facto2_sched4_kway_rqrrtend ................***Timeout 405.35 sec Start 3484: mpi_dst_example_simple_lap_z_facto2_sched4_kway_rqrrtend 3144/3626 Test #3485: mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_rqrrtbegin ...***Timeout 407.50 sec Start 3485: mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_rqrrtbegin 3144/3626 Test #3486: mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_rqrrtend .....***Timeout 409.31 sec Start 3486: mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_rqrrtend 3144/3626 Test #3487: mpi_dst_example_simple_lap_z_facto2_sched4_kway_pqrcpilu0 ...............***Timeout 410.95 sec Start 3487: mpi_dst_example_simple_lap_z_facto2_sched4_kway_pqrcpilu0 3144/3626 Test #3488: mpi_dst_example_simple_lap_z_facto2_sched4_kway_pqrcpilu1 
...............***Timeout 410.92 sec Start 3488: mpi_dst_example_simple_lap_z_facto2_sched4_kway_pqrcpilu1 3144/3626 Test #3489: mpi_dst_example_simple_lap_z_facto3_sched4_not_svdbegin .................***Timeout 410.92 sec Start 3489: mpi_dst_example_simple_lap_z_facto3_sched4_not_svdbegin 3144/3626 Test #3490: mpi_dst_example_simple_lap_z_facto3_sched4_not_svdend ...................***Timeout 410.52 sec Start 3490: mpi_dst_example_simple_lap_z_facto3_sched4_not_svdend 3144/3626 Test #3491: mpi_dst_example_simple_lap_z_facto3_sched4_kway_svdbegin ................***Timeout 410.02 sec Start 3491: mpi_dst_example_simple_lap_z_facto3_sched4_kway_svdbegin 3144/3626 Test #3492: mpi_dst_example_simple_lap_z_facto3_sched4_kway_svdend ..................***Timeout 409.69 sec Start 3492: mpi_dst_example_simple_lap_z_facto3_sched4_kway_svdend 3144/3626 Test #3493: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_svdbegin .....***Timeout 409.25 sec Start 3493: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_svdbegin 3144/3626 Test #3494: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_svdend .......***Timeout 408.67 sec Start 3494: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_svdend 3144/3626 Test #3495: mpi_dst_example_simple_lap_z_facto3_sched4_not_pqrcpbegin ...............***Timeout 407.96 sec Start 3495: mpi_dst_example_simple_lap_z_facto3_sched4_not_pqrcpbegin 3144/3626 Test #3496: mpi_dst_example_simple_lap_z_facto3_sched4_not_pqrcpend .................***Timeout 407.71 sec Start 3496: mpi_dst_example_simple_lap_z_facto3_sched4_not_pqrcpend 3144/3626 Test #3497: mpi_dst_example_simple_lap_z_facto3_sched4_kway_pqrcpbegin ..............***Timeout 407.48 sec Start 3497: mpi_dst_example_simple_lap_z_facto3_sched4_kway_pqrcpbegin 3144/3626 Test #3498: mpi_dst_example_simple_lap_z_facto3_sched4_kway_pqrcpend ................***Timeout 407.24 sec Start 3498: mpi_dst_example_simple_lap_z_facto3_sched4_kway_pqrcpend 3144/3626 Test 
#3499: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_pqrcpbegin ...***Timeout 406.78 sec Start 3499: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_pqrcpbegin 3144/3626 Test #3500: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_pqrcpend .....***Timeout 406.08 sec Start 3500: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_pqrcpend 3144/3626 Test #3501: mpi_dst_example_simple_lap_z_facto3_sched4_not_rqrcpbegin ...............***Timeout 405.63 sec Start 3501: mpi_dst_example_simple_lap_z_facto3_sched4_not_rqrcpbegin 3144/3626 Test #3502: mpi_dst_example_simple_lap_z_facto3_sched4_not_rqrcpend .................***Timeout 404.84 sec Start 3502: mpi_dst_example_simple_lap_z_facto3_sched4_not_rqrcpend 3144/3626 Test #3503: mpi_dst_example_simple_lap_z_facto3_sched4_kway_rqrcpbegin ..............***Timeout 403.58 sec Start 3503: mpi_dst_example_simple_lap_z_facto3_sched4_kway_rqrcpbegin 3144/3626 Test #3504: mpi_dst_example_simple_lap_z_facto3_sched4_kway_rqrcpend ................***Timeout 404.37 sec Start 3504: mpi_dst_example_simple_lap_z_facto3_sched4_kway_rqrcpend 3144/3626 Test #3505: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_rqrcpbegin ...***Timeout 404.37 sec Start 3505: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_rqrcpbegin 3144/3626 Test #3506: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_rqrcpend .....***Timeout 404.37 sec Start 3506: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_rqrcpend 3144/3626 Test #3507: mpi_dst_example_simple_lap_z_facto3_sched4_not_tqrcpbegin ...............***Timeout 403.90 sec Start 3507: mpi_dst_example_simple_lap_z_facto3_sched4_not_tqrcpbegin 3144/3626 Test #3508: mpi_dst_example_simple_lap_z_facto3_sched4_not_tqrcpend .................***Timeout 403.39 sec Start 3508: mpi_dst_example_simple_lap_z_facto3_sched4_not_tqrcpend 3144/3626 Test #3509: mpi_dst_example_simple_lap_z_facto3_sched4_kway_tqrcpbegin 
..............***Timeout 404.00 sec Start 3509: mpi_dst_example_simple_lap_z_facto3_sched4_kway_tqrcpbegin Test #3078: mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_svdend .......***Timeout 474.63 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.152448e+02 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.726066e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 
Stoping criterion -1 Time for reordering 1.950487e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.268798e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.199523e+02 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.764426e+00 s Time to initialize coeftab 4.087879e-01 s Time to factorize 1.888600e+01 s (283.76 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko ------------------------------------------------ Total 68.5 Ko / 68.5 Ko Time to solve 1.012344e+01 s Time for refinement 9.489295e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.093077e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.741714e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.741714e-07 max(|| b_i - A x_i ||_1) 7.533756e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.466845e-01 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.741714e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.741714e-07 max(|| b_i - A x_i ||_1) 7.533756e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.466845e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 7.533756e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 
9.466845e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 7.533756e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 9.466845e-01 (SUCCESS) Test #3085: mpi_dst_example_simple_lap_s_facto1_sched4_not_rqrcpbegin ...............***Timeout 474.87 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.980544e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.270166e-03 s +-------------------------------------------------+ 
Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.166024e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.036353e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.756082e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.633254e+00 s Time to initialize coeftab 2.122456e+00 s Time to factorize 1.063535e+02 s (50.39 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44 Ko / 44.3 Ko ------------------------------------------------ Total 68.2 Ko / 68.5 Ko Test #3098: mpi_dst_example_simple_lap_s_facto1_sched4_not_rqrrtend .................***Timeout 475.87 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time 
Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.913284e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.332325e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.311836e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.650509e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.841829e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.305055e+00 s Time to initialize coeftab 4.969035e-01 s Time to factorize 3.228099e+01 s (166.01 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko 
------------------------------------------------ Total 68.5 Ko / 68.5 Ko Test #3110: mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_svdend .......***Timeout 475.86 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.095757e+02 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.542129e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for 
reordering 2.231572e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 3.760662e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.198117e+02 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.609741e+00 s Time to initialize coeftab 4.505703e+00 s Time to factorize 1.762823e+01 s (579.99 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Test #3113: mpi_dst_example_simple_lap_s_facto2_sched4_kway_pqrcpbegin ..............***Timeout 475.86 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion 
per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.481006e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.708738e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.564839e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.506029e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.012803e+02 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.643379e+00 s Time to initialize coeftab 2.063829e+00 s Time to factorize 5.137099e+01 s (199.03 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko Test #3123: mpi_dst_example_simple_lap_s_facto2_sched4_not_tqrcpbegin ...............***Timeout 476.34 sec ischedInit: 
The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.023937e+02 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.233856e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.298265e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 
9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 3.733146e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.122263e+02 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.646007e+00 s Time to initialize coeftab 2.611477e+00 s Time to factorize 8.096219e+01 s (126.28 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88 Ko / 88.6 Ko ------------------------------------------------ Total 112 Ko / 113 Ko Test #3129: mpi_dst_example_simple_lap_s_facto2_sched4_not_rqrrtbegin ...............***Timeout 476.57 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The 
thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.113863e+02 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.135154e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.929421e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 4.501586e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.226947e+02 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.858588e+00 s Time to initialize coeftab 1.769335e+00 s Time to factorize 2.865627e+01 s (356.79 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Test #3130: mpi_dst_example_simple_lap_s_facto2_sched4_not_rqrrtend .................***Timeout 476.57 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.001054e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.635305e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.882567e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 5.900346e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 
9.791518e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.523545e+00 s Time to initialize coeftab 2.288920e+00 s Time to factorize 3.783429e+01 s (270.24 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 113 Ko / 113 Ko Test #3140: mpi_dst_example_simple_lap_d_facto0_sched4_kway_svdend ..................***Timeout 479.04 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 
300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.110850e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.396389e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.891696e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 7.331564e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.063905e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.122247e+00 s Time to initialize coeftab 1.801965e+00 s Time to factorize 9.867291e+01 s (52.54 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko Test #3151: mpi_dst_example_simple_lap_d_facto0_sched4_kway_rqrcpbegin ..............***Timeout 479.57 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 
Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.868545e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.413680e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.490763e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 4.919156e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.396057e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.234240e+00 s Time to initialize coeftab 3.569993e+00 s 
Time to factorize 8.559198e+01 s (60.56 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88 Ko / 88.6 Ko ------------------------------------------------ Total 136 Ko / 137 Ko Test #3154: mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_rqrcpend .....***Timeout 478.36 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.156242e+02 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.565240e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.753082e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 7.854447e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.276046e+02 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.307183e+01 s Time to initialize coeftab 2.065998e+00 s Time to factorize 4.149299e+01 s (124.93 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Test #3161: mpi_dst_example_simple_lap_d_facto0_sched4_not_rqrrtbegin ...............***Timeout 474.80 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per 
process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.903025e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.546832e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.473641e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 5.474279e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.825696e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.785520e+00 s Time to initialize coeftab 2.681344e+00 s Time to factorize 3.532714e+01 s (146.74 KFlop/s) Number of operations 25.89 MFlops Number of 
static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Test #3162: mpi_dst_example_simple_lap_d_facto0_sched4_not_rqrrtend .................***Timeout 474.22 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.296573e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax 
Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.936823e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.141514e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.476159e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.889602e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 5.010285e+00 s Time to initialize coeftab 1.548268e+00 s Time to factorize 7.657048e+01 s (67.70 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Test #3186: mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_rqrcpend .....***Timeout 474.39 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank 
parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.157422e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.140701e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.790871e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.921913e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.180833e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.768536e+00 s Time to initialize coeftab 7.783429e-01 s Time to factorize 4.000984e+01 s (133.94 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 
Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Test #3258: mpi_dst_example_simple_lap_c_facto0_sched4_not_rqrrtend .................***Timeout 474.38 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.625072e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.332895e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time 
for reordering 7.930596e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 8.449850e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.814857e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.700470e+00 s Time to initialize coeftab 8.513442e-01 s Start 3258: mpi_dst_example_simple_lap_c_facto0_sched4_not_rqrrtend Test #3170: mpi_dst_example_simple_lap_d_facto1_sched4_not_svdend ...................***Timeout 482.70 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: 
Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.611875e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.019222e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.842875e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.605427e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.518072e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.990266e+00 s Time to initialize coeftab 5.459956e-01 s Time to factorize 6.345914e+01 s (84.45 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Test #3176: mpi_dst_example_simple_lap_d_facto1_sched4_not_pqrcpend .................***Timeout 480.47 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : 
Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.159816e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.268794e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.110731e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.032404e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.316695e+01 s 
+-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.842331e+00 s Time to initialize coeftab 4.144521e-01 s Time to factorize 1.284291e+01 s (417.28 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Test #3178: mpi_dst_example_simple_lap_d_facto1_sched4_kway_pqrcpend ................***Timeout 480.66 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 
2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.447681e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.913174e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.278707e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.159018e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.053942e+02 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.508408e+00 s Time to initialize coeftab 5.670199e-01 s Time to factorize 4.403995e+01 s (121.69 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Test #3180: mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_pqrcpend .....***Timeout 480.44 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread 
dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.901982e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.126072e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.599512e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 9.374744e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.116938e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to 
initialize internal csc 4.733520e+00 s Time to initialize coeftab 2.136828e+00 s Time to factorize 4.498470e+01 s (119.13 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Test #3182: mpi_dst_example_simple_lap_d_facto1_sched4_not_rqrcpend .................***Timeout 481.02 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering 
method is: Scotch Time to compute ordering 2.249674e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.857440e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.733417e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.330611e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.575356e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 6.761002e+00 s Time to initialize coeftab 1.766489e+00 s Time to factorize 5.004563e+01 s (107.08 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko Test #3272: mpi_dst_example_simple_lap_c_facto1_sched4_not_pqrcpend .................***Timeout 487.70 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: 
PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.600603e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.817503e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.566409e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.288494e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.448543e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.202511e+00 s Time to initialize coeftab 1.005083e+00 s Time to factorize 6.499513e+01 s (335.71 KFlop/s) Number of operations 106.83 MFlops 
Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko Start 3272: mpi_dst_example_simple_lap_c_facto1_sched4_not_pqrcpend Test #3274: mpi_dst_example_simple_lap_c_facto1_sched4_kway_pqrcpend ................***Timeout 488.77 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.166495e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: 
Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.083770e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.199734e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.091591e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.542877e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 8.688788e+00 s Time to initialize coeftab 1.926982e+00 s Time to factorize 8.780403e+01 s (248.50 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Start 3274: mpi_dst_example_simple_lap_c_facto1_sched4_kway_pqrcpend Test #3284: mpi_dst_example_simple_lap_c_facto1_sched4_not_tqrcpend .................***Timeout 490.03 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 
Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.978355e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.318386e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.350750e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.013335e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.554748e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.007365e+00 s Time to initialize coeftab 6.620771e-01 s Time to factorize 1.024515e+02 s (212.97 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: 
------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Start 3284: mpi_dst_example_simple_lap_c_facto1_sched4_not_tqrcpend Test #3210: mpi_dst_example_simple_lap_d_facto2_sched4_kway_pqrcpend ................***Timeout 501.74 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.720596e+01 s +-------------------------------------------------+ Symbolic factorization 
subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.120992e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.058717e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 9.319174e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.721335e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 4.726772e+00 s Time to initialize coeftab 7.300519e-01 s Time to factorize 4.215428e+01 s (242.54 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Test #3432: mpi_dst_example_simple_lap_z_facto1_sched4_not_pqrcpend .................***Timeout 500.12 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: 
Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.877746e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.563916e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.953785e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 4.455584e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.629278e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.606498e+00 s Time to initialize coeftab 1.189996e+00 s Time to factorize 6.217901e+01 s (350.91 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside 
selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Start 3432: mpi_dst_example_simple_lap_z_facto1_sched4_not_pqrcpend Test #3434: mpi_dst_example_simple_lap_z_facto1_sched4_kway_pqrcpend ................***Timeout 499.82 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.630827e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.531738e-03 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.845854e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.473172e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.155638e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.486059e+00 s Time to initialize coeftab 3.824257e+00 s Time to factorize 6.488872e+01 s (336.26 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Start 3434: mpi_dst_example_simple_lap_z_facto1_sched4_kway_pqrcpend Test #3438: mpi_dst_example_simple_lap_z_facto1_sched4_not_rqrcpend .................***Timeout 500.69 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 
GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.414312e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.719889e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.852591e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 4.895818e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.064538e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.506667e-02 s Time to initialize coeftab 8.838830e-01 s Time to factorize 1.017127e+02 s (214.52 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko Start 3438: 
mpi_dst_example_simple_lap_z_facto1_sched4_not_rqrcpend Test #3198: mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_rqrrtend .....***Timeout 471.77 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.859890e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.574284e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 
1.288957e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.992515e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.581089e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 6.849346e+00 s Time to initialize coeftab 2.306374e+00 s Test #3204: mpi_dst_example_simple_lap_d_facto2_sched4_kway_svdend ..................***Timeout 469.49 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 
1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.009689e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.216599e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.263518e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 4.061721e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.663442e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.448720e+00 s Time to initialize coeftab 2.664538e+00 s Time to factorize 6.125603e+01 s (166.91 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Test #3208: mpi_dst_example_simple_lap_d_facto2_sched4_not_pqrcpend .................***Timeout 469.13 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 
Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.879796e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.313665e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.335297e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.690197e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.041958e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize 
internal csc 7.766806e+00 s Time to initialize coeftab 1.935724e+00 s Time to factorize 3.299403e+01 s (309.88 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Test #3209: mpi_dst_example_simple_lap_d_facto2_sched4_kway_pqrcpbegin ..............***Timeout 468.87 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.161827e+01 s +-------------------------------------------------+ Symbolic factorization 
subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.419287e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.486325e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 3.919243e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.798255e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 8.600154e-01 s Time to initialize coeftab 2.105370e+00 s Time to factorize 1.020715e+02 s (100.17 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Test #3212: mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_pqrcpend .....***Timeout 468.58 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number 
of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.106275e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.305987e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.048485e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 7.477299e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.043472e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.160188e+00 s Time to initialize coeftab 1.277006e+00 s Time to factorize 5.535273e+01 s (184.71 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ 
Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko Test #3226: mpi_dst_example_simple_lap_d_facto2_sched4_not_rqrrtend .................***Timeout 469.82 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.707693e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.310276e+00 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.238798e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.579397e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.612191e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.115701e+01 s Time to initialize coeftab 1.831925e+00 s Time to factorize 4.935290e+01 s (207.17 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Test #3228: mpi_dst_example_simple_lap_d_facto2_sched4_kway_rqrrtend ................***Timeout 469.93 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 
Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.682771e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.552802e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.069585e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.189851e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.913363e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.293484e+00 s Time to initialize coeftab 1.854987e+00 s Time to factorize 5.047011e+01 s (202.58 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko 
------------------------------------------------ Total 226 Ko / 226 Ko Test #3232: mpi_dst_example_simple_lap_d_facto2_sched4_kway_pqrcpilu1 ...............***Timeout 470.29 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.955525e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.262471e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 
2.887642e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.005190e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.003976e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 4.720300e+00 s Time to initialize coeftab 1.722172e+00 s Time to factorize 7.227627e+01 s (141.46 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko 3172/3626 Test #3553: bcsc_shm_test_bcsc_spmv_tests_lap_s .....................................***Timeout 340.54 sec ischedInit: The thread number has been automatically set to 256 Start 3553: bcsc_shm_test_bcsc_spmv_tests_lap_s 3172/3626 Test #3554: bcsc_shm_test_bcsc_spmv_tests_lap_d .....................................***Timeout 340.06 sec ischedInit: The thread number has been automatically set to 256 Start 3554: bcsc_shm_test_bcsc_spmv_tests_lap_d 3172/3626 Test #3555: bcsc_shm_test_bcsc_spmv_tests_lap_c .....................................***Timeout 340.06 sec ischedInit: The thread number has been automatically set to 256 Start 3555: bcsc_shm_test_bcsc_spmv_tests_lap_c 3172/3626 Test #3556: bcsc_shm_test_bcsc_spmv_tests_lap_z .....................................***Timeout 340.06 sec ischedInit: The thread number has been automatically set to 256 Start 3556: bcsc_shm_test_bcsc_spmv_tests_lap_z 3172/3626 Test #3557: bcsc_shm_test_bcsc_spmv_tests_rsa 
.......................................***Timeout 339.91 sec RSA driver is no longer supported and is replaced by the HB driver ischedInit: The thread number has been automatically set to 256 Start 3557: bcsc_shm_test_bcsc_spmv_tests_rsa 3172/3626 Test #3558: bcsc_shm_test_bcsc_spmv_tests_mm ........................................***Timeout 339.90 sec ischedInit: The thread number has been automatically set to 256 Start 3558: bcsc_shm_test_bcsc_spmv_tests_mm 3172/3626 Test #3559: bcsc_shm_test_bcsc_spmv_tests_hb ........................................***Timeout 339.89 sec ischedInit: The thread number has been automatically set to 256 Start 3559: bcsc_shm_test_bcsc_spmv_tests_hb 3172/3626 Test #3560: bcsc_shm_test_bcsc_spmv_tests_mm2 .......................................***Timeout 338.31 sec ischedInit: The thread number has been automatically set to 256 Start 3560: bcsc_shm_test_bcsc_spmv_tests_mm2 3172/3626 Test #3561: bcsc_shm_test_bcsc_spmv_time_lap_s ......................................***Timeout 338.30 sec ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 3561: bcsc_shm_test_bcsc_spmv_time_lap_s 3172/3626 Test #3562: bcsc_shm_test_bcsc_spmv_time_lap_d ......................................***Timeout 338.28 sec ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Start 3562: bcsc_shm_test_bcsc_spmv_time_lap_d 3172/3626 Test #3563: bcsc_shm_test_bcsc_spmv_time_lap_c ......................................***Timeout 338.24 sec ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 3563: bcsc_shm_test_bcsc_spmv_time_lap_c 3172/3626 Test #3564: bcsc_shm_test_bcsc_spmv_time_lap_z ......................................***Timeout 338.24 sec ischedInit: The thread number has been automatically set to 256 
Start 3564: bcsc_shm_test_bcsc_spmv_time_lap_z 3172/3626 Test #3566: bcsc_shm_test_bcsc_spmv_time_mm .........................................***Timeout 338.35 sec ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Complex64 Format: IJV N: 841 nnz: 2465 Start 3566: bcsc_shm_test_bcsc_spmv_time_mm 3172/3626 Test #3567: bcsc_shm_test_bcsc_spmv_time_hb .........................................***Timeout 338.35 sec ischedInit: The thread number has been automatically set to 256 Matrix type: General Arithmetic: Double Format: CSC N: 1030 nnz: 6858 Start 3567: bcsc_shm_test_bcsc_spmv_time_hb 3172/3626 Test #3568: bcsc_shm_test_bcsc_spmv_time_mm2 ........................................***Timeout 338.34 sec ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: IJV N: 1280 nnz: 12029 Start 3568: bcsc_shm_test_bcsc_spmv_time_mm2 3172/3626 Test #3569: bcsc_shm_test_bvec_gemv_tests ...........................................***Timeout 338.34 sec ischedInit: The thread number has been automatically set to 256 Case Float - Sequential: SUCCESS Case Double - Sequential: SUCCESS Case Complex32 - Sequential: SUCCESS Case Complex64 - Sequential: SUCCESS Case Float - Static: SUCCESS Case Double - Static: SUCCESS Case Complex32 - Static: SUCCESS Case Complex64 - Static: SUCCESS -- All tests PASSED -- Start 3569: bcsc_shm_test_bvec_gemv_tests 3172/3626 Test #3571: bcsc_shm_test_bvec_applyorder_tests .....................................***Timeout 336.88 sec Start 3571: bcsc_shm_test_bvec_applyorder_tests 3172/3626 Test #3578: bcsc_mpi_rep_test_bcsc_spmv_tests_hb ....................................***Timeout 334.82 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Start 3578: bcsc_mpi_rep_test_bcsc_spmv_tests_hb 3172/3626 Test #3588: bcsc_mpi_rep_test_bvec_gemv_tests .......................................***Timeout 333.86 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Case Float - Sequential: SUCCESS Case Double - Sequential: SUCCESS Case Complex32 - Sequential: SUCCESS Start 3588: bcsc_mpi_rep_test_bvec_gemv_tests 3172/3626 Test #3590: bcsc_mpi_rep_test_bvec_applyorder_tests .................................***Timeout 334.07 sec Start 3590: bcsc_mpi_rep_test_bvec_applyorder_tests 3172/3626 Test #3597: bcsc_mpi_dst_test_bcsc_spmv_tests_hb ....................................***Timeout 332.42 sec Start 3597: bcsc_mpi_dst_test_bcsc_spmv_tests_hb 3172/3626 Test #3609: fortran_shm_fsimple .....................................................***Timeout 329.72 sec Start 3609: fortran_shm_fsimple 3172/3626 Test #3611: fortran_shm_flaplacian ..................................................***Timeout 328.11 sec Start 3611: fortran_shm_flaplacian 3172/3626 Test #3617: fortran_shm_fusermat_csr ................................................***Timeout 323.94 sec Start 3617: fortran_shm_fusermat_csr 3172/3626 Test #3621: python_shm_simple 
.......................................................***Timeout 324.50 sec Start 3621: python_shm_simple 3172/3626 Test #3625: python_shm_simple_obj ...................................................***Timeout 325.67 sec Start 3625: python_shm_simple_obj Test #3063: mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_tqrcpbegin ...***Timeout 641.95 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.212215e+02 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 
65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.926422e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.774312e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 5.232117e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.307520e+02 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.453913e+00 s Time to initialize coeftab 2.164193e+00 s Time to factorize 1.127011e+02 s (46.00 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44 Ko / 44.3 Ko ------------------------------------------------ Total 68.2 Ko / 68.5 Ko Test #3076: mpi_dst_example_simple_lap_s_facto1_sched4_kway_svdend ..................***Timeout 641.95 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational 
models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.432967e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.340721e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.911155e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.136594e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.759948e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 8.513010e+00 s Time to initialize coeftab 1.976179e+00 s Test #3077: mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_svdbegin .....***Timeout 641.94 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread 
number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.718513e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.106861e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.465723e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.697969e+01 s 
+-------------------------------------------------+ Analyze task: Total time for analyze 8.692398e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 8.527439e+00 s Time to initialize coeftab 9.664864e+00 s Test #3079: mpi_dst_example_simple_lap_s_facto1_sched4_not_pqrcpbegin ...............***Timeout 641.92 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.182781e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of 
nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.374887e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.420482e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.324120e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.855432e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 6.937682e+00 s Time to initialize coeftab 5.351009e+00 s Time to factorize 8.550692e+01 s (62.67 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.3 Ko / 44.3 Ko Test #3091: mpi_dst_example_simple_lap_s_facto1_sched4_not_tqrcpbegin ...............***Timeout 641.90 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 
320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.495070e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.950452e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.503448e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.440002e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.377221e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 5.245740e+00 s Time to initialize coeftab 9.462867e+00 s Time to factorize 8.445098e+01 s (63.46 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside 
selected 0 o / 0 o Outside 44 Ko / 44.3 Ko ------------------------------------------------ Total 68.2 Ko / 68.5 Ko Test #3095: mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_tqrcpbegin ...***Timeout 641.90 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.292004e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.078742e-01 s +-------------------------------------------------+ Reordering subtask: 
Split level 0 Stoping criterion -1 Time for reordering 4.536109e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.256016e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.986759e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.979162e+00 s Time to initialize coeftab 3.906884e+00 s Time to factorize 9.691399e+01 s (55.30 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 44.2 Ko / 44.3 Ko ------------------------------------------------ Total 68.4 Ko / 68.5 Ko Test #3117: mpi_dst_example_simple_lap_s_facto2_sched4_not_rqrcpbegin ...............***Timeout 641.87 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 
16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.307910e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.152761e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.795717e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 8.111882e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.423332e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.604139e+00 s Time to initialize coeftab 4.527523e+00 s Time to factorize 1.017318e+02 s (100.50 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88 Ko / 88.6 Ko ------------------------------------------------ Total 112 Ko / 113 
Ko Test #3119: mpi_dst_example_simple_lap_s_facto2_sched4_kway_rqrcpbegin ..............***Timeout 641.86 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.578304e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.510676e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.642696e-01 s +-------------------------------------------------+ 
Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.684866e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.309763e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 9.127780e+00 s Time to initialize coeftab 1.014903e+01 s Test #3127: mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_tqrcpbegin ...***Timeout 641.84 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 
+-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.388591e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.832661e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.769064e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 5.458964e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.098061e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.835787e+00 s Time to initialize coeftab 3.654596e+00 s Time to factorize 1.117296e+02 s (91.51 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 24.2 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88 Ko / 88.6 Ko ------------------------------------------------ Total 112 Ko / 113 Ko Test #3137: mpi_dst_example_simple_lap_d_facto0_sched4_not_svdbegin .................***Timeout 641.82 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled 
thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.153425e+02 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.522627e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.327458e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.082396e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.194122e+02 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 9.369695e-01 s Time to 
initialize coeftab 9.260762e-01 s Test #3139: mpi_dst_example_simple_lap_d_facto0_sched4_kway_svdbegin ................***Timeout 641.81 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.014529e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.190319e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.650709e-01 s 
+-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.725517e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.095513e+02 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.298895e+00 s Time to initialize coeftab 3.223624e+00 s Test #3141: mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_svdbegin .....***Timeout 641.09 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 
1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.702153e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.134528e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.864082e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 8.090231e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.849748e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.009579e+00 s Time to initialize coeftab 3.738158e+00 s Test #3155: mpi_dst_example_simple_lap_d_facto0_sched4_not_tqrcpbegin ...............***Timeout 639.26 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 
Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.002116e+02 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.092500e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.736319e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.351111e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.255358e+02 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.070389e+00 s Time to initialize coeftab 3.207001e+00 s Test #3157: mpi_dst_example_simple_lap_d_facto0_sched4_kway_tqrcpbegin ..............***Timeout 639.25 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 
Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.937150e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.526041e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.519145e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 8.863594e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 
6.093238e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.792873e+00 s Time to initialize coeftab 4.939745e+00 s Test #3159: mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_tqrcpbegin ...***Timeout 632.96 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.547691e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 
6.234836e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.068236e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.051489e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.214133e+02 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 7.270248e+00 s Time to initialize coeftab 5.549199e+00 s Test #3163: mpi_dst_example_simple_lap_d_facto0_sched4_kway_rqrrtbegin ..............***Timeout 631.10 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 
Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.693130e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.802840e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.477635e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 5.287418e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.766527e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.427039e+00 s Time to initialize coeftab 2.100631e+00 s Test #3179: mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_pqrcpbegin ...***Timeout 631.08 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 
2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.360855e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.224011e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.468226e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.853558e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.521452e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.394121e+01 s Time to initialize coeftab 4.702868e+00 s Time to factorize 7.496286e+01 s (71.49 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank 
supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Test #3259: mpi_dst_example_simple_lap_c_facto0_sched4_kway_rqrrtbegin ..............***Timeout 631.05 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.511529e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.155508e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping 
criterion -1 Time for reordering 1.877251e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 7.019148e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.286375e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.948459e+00 s Time to initialize coeftab 9.881014e+00 s Start 3259: mpi_dst_example_simple_lap_c_facto0_sched4_kway_rqrrtbegin Test #3261: mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_rqrrtbegin ...***Timeout 631.05 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been 
automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.239007e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.440722e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.694035e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 5.577445e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.093958e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.700654e+00 s Time to initialize coeftab 7.214162e+00 s Start 3261: mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_rqrrtbegin Test #3262: mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_rqrrtend .....***Timeout 630.44 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI 
processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.209174e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.441840e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.928400e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.069723e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.778580e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.005767e+01 s Time to initialize coeftab 1.989096e+00 s Start 3262: mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_rqrrtend Test #3264: 
mpi_dst_example_simple_lap_c_facto0_sched4_kway_pqrcpilu1 ...............***Timeout 630.43 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.313536e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.865675e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.287407e-01 s +-------------------------------------------------+ Mapping/Scheduling 
subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.866955e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.720701e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 9.391950e+00 s Time to initialize coeftab 3.062123e+00 s Start 3264: mpi_dst_example_simple_lap_c_facto0_sched4_kway_pqrcpilu1 Test #3265: mpi_dst_example_simple_lap_c_facto1_sched4_not_svdbegin .................***Timeout 630.42 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 
200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.752003e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.202004e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.701631e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.195042e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.641265e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.650999e+00 s Time to initialize coeftab 5.425001e+00 s Start 3265: mpi_dst_example_simple_lap_c_facto1_sched4_not_svdbegin Test #3266: mpi_dst_example_simple_lap_c_facto1_sched4_not_svdend ...................***Timeout 630.19 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 
6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.680490e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.929007e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.058366e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 9.399088e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.825220e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.877396e+00 s Time to initialize coeftab 8.874222e-01 s Start 3266: mpi_dst_example_simple_lap_c_facto1_sched4_not_svdend Test #3166: mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_rqrrtend .....***Timeout 629.64 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been 
automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.470149e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.832360e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.685213e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s 
Time for mapping/scheduling 2.568430e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.182418e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.241499e+01 s Time to initialize coeftab 1.661102e+00 s Test #3167: mpi_dst_example_simple_lap_d_facto0_sched4_kway_pqrcpilu0 ...............***Timeout 629.63 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.658312e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol 
factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.651260e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.290690e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 6.476749e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.679270e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.106378e+00 s Time to initialize coeftab 5.902364e-01 s Time to factorize 1.377782e+02 s (37.62 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Test #3168: mpi_dst_example_simple_lap_d_facto0_sched4_kway_pqrcpilu1 ...............***Timeout 629.62 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization 
method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.034747e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.725938e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.757761e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 5.06 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 7.821229e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.107066e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 7.983936e+00 s Time to initialize coeftab 2.399894e+00 s Time to factorize 9.852866e+01 s (52.61 KFlop/s) Number of operations 25.89 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Test #3169: mpi_dst_example_simple_lap_d_facto1_sched4_not_svdbegin .................***Timeout 629.60 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has 
been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.794405e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.768425e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.457210e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for 
mapping/scheduling 7.081244e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.664935e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.492038e+00 s Time to initialize coeftab 2.147342e+00 s Test #3174: mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_svdend .......***Timeout 629.00 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.314843e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol 
factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.934005e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.224549e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 8.394298e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.581807e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.478004e+00 s Time to initialize coeftab 1.125656e+00 s Test #3175: mpi_dst_example_simple_lap_d_facto1_sched4_not_pqrcpbegin ...............***Timeout 627.06 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels 
of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.426494e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.611102e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.925406e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.063916e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.697304e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 6.916086e+00 s Time to initialize coeftab 3.536366e+00 s Test #3177: mpi_dst_example_simple_lap_d_facto1_sched4_kway_pqrcpbegin ..............***Timeout 625.92 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI 
communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.773967e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.020048e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.035104e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.921526e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.660422e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.853878e+00 s Time to initialize coeftab 2.323803e+00 s Time to factorize 7.900557e+01 s (67.83 KFlop/s) Number of 
operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Test #3181: mpi_dst_example_simple_lap_d_facto1_sched4_not_rqrcpbegin ...............***Timeout 625.10 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.160018e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute 
symbol matrix 8.873711e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.308405e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 8.829789e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.446787e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.006046e+00 s Time to initialize coeftab 4.716532e+00 s Test #3183: mpi_dst_example_simple_lap_d_facto1_sched4_kway_rqrcpbegin ..............***Timeout 625.05 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 
Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.702555e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.408473e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.567672e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.198502e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.144378e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 7.013452e-01 s Time to initialize coeftab 2.002516e+00 s Test #3267: mpi_dst_example_simple_lap_c_facto1_sched4_kway_svdbegin ................***Timeout 625.02 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 
Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.417813e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.145969e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.043896e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.385899e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.120009e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 6.040781e+00 s Time to initialize coeftab 1.360700e+01 s Start 3267: mpi_dst_example_simple_lap_c_facto1_sched4_kway_svdbegin Test #3268: mpi_dst_example_simple_lap_c_facto1_sched4_kway_svdend ..................***Timeout 625.01 
sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.453368e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.636853e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.213499e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in 
full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.629654e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.492749e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 9.133457e+00 s Time to initialize coeftab 1.282003e+00 s Start 3268: mpi_dst_example_simple_lap_c_facto1_sched4_kway_svdend Test #3269: mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_svdbegin .....***Timeout 625.01 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering 
subtask : Ordering method is: Scotch Time to compute ordering 6.391406e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.042954e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.017666e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.182154e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.455578e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.844925e+00 s Time to initialize coeftab 4.607569e+00 s Start 3269: mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_svdbegin Test #3270: mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_svdend .......***Timeout 624.99 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - 
Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.657777e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.783077e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.119797e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.297548e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.484528e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.464464e+00 s Time to initialize coeftab 5.528631e-01 s Start 3270: mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_svdend Test #3271: mpi_dst_example_simple_lap_c_facto1_sched4_not_pqrcpbegin ...............***Timeout 624.95 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread 
number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.772030e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.453080e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.057604e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.464966e+00 s 
+-------------------------------------------------+ Analyze task: Total time for analyze 5.498962e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.853531e-01 s Time to initialize coeftab 2.025196e+00 s Start 3271: mpi_dst_example_simple_lap_c_facto1_sched4_not_pqrcpbegin Test #3273: mpi_dst_example_simple_lap_c_facto1_sched4_kway_pqrcpbegin ..............***Timeout 624.89 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 +-------------------------------------------------+ Ordering subtask : 1: 300 1140 2: 200 760 3: 200 660 Ordering method is: Scotch Time to compute ordering 4.598443e+01 s +-------------------------------------------------+ Symbolic 
factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.632987e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.738226e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 9.995323e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.799467e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 5.456699e+00 s Time to initialize coeftab 5.138317e+00 s Start 3273: mpi_dst_example_simple_lap_c_facto1_sched4_kway_pqrcpbegin Test #3275: mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_pqrcpbegin ...***Timeout 624.85 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 
Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.425239e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.048437e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.571911e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.989513e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.501092e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.073460e+00 s Time to initialize coeftab 4.119104e+00 s Start 3275: mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_pqrcpbegin Test #3276: mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_pqrcpend .....***Timeout 624.83 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.046997e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.059660e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.605777e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 9.476217e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.149681e+01 s 
+-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 8.656408e+00 s Time to initialize coeftab 2.746189e+00 s Time to factorize 7.613534e+01 s (286.59 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Start 3276: mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_pqrcpend Test #3277: mpi_dst_example_simple_lap_c_facto1_sched4_not_rqrcpbegin ...............***Timeout 623.79 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian 
Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.652755e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.719022e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.510135e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.436941e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.482470e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.242738e+00 s Time to initialize coeftab 5.730005e+00 s Start 3277: mpi_dst_example_simple_lap_c_facto1_sched4_not_rqrcpbegin Test #3278: mpi_dst_example_simple_lap_c_facto1_sched4_not_rqrcpend .................***Timeout 623.78 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 
Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.144751e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.941678e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.223070e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.022149e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.953776e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.867838e+00 s Time to initialize coeftab 1.475214e+00 s Start 3278: mpi_dst_example_simple_lap_c_facto1_sched4_not_rqrcpend Test #3279: mpi_dst_example_simple_lap_c_facto1_sched4_kway_rqrcpbegin ..............***Timeout 622.72 
sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.477958e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.282069e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.511205e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in 
full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.458859e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.374372e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.115186e+01 s Time to initialize coeftab 1.811372e+01 s Start 3279: mpi_dst_example_simple_lap_c_facto1_sched4_kway_rqrcpbegin Test #3280: mpi_dst_example_simple_lap_c_facto1_sched4_kway_rqrcpend ................***Timeout 621.25 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : 
Ordering method is: Scotch Time to compute ordering 8.340408e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.350119e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.897252e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.971583e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.240559e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.635964e+00 s Time to initialize coeftab 8.093345e-01 s Start 3280: mpi_dst_example_simple_lap_c_facto1_sched4_kway_rqrcpend Test #3281: mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_rqrcpbegin ...***Timeout 621.24 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory 
Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.239733e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.592616e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.226997e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.005745e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.683590e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 9.070158e+00 s Time to initialize coeftab 1.475757e+01 s Start 3281: mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_rqrcpbegin Test #3282: mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_rqrcpend .....***Timeout 619.94 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number 
has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.602315e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.728492e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.891873e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 8.482309e+00 s 
+-------------------------------------------------+ Analyze task: Total time for analyze 7.974429e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.850244e+00 s Time to initialize coeftab 8.599991e-01 s Start 3282: mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_rqrcpend Test #3283: mpi_dst_example_simple_lap_c_facto1_sched4_not_tqrcpbegin ...............***Timeout 619.43 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.243396e+01 s +-------------------------------------------------+ 
Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.414838e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.610210e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 8.440315e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.592326e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.507401e+00 s Time to initialize coeftab 1.063001e+01 s Start 3283: mpi_dst_example_simple_lap_c_facto1_sched4_not_tqrcpbegin Test #3285: mpi_dst_example_simple_lap_c_facto1_sched4_kway_tqrcpbegin ..............***Timeout 617.68 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 
Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.334185e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.117224e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.988975e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.873855e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.349925e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.970426e+00 s Time to initialize coeftab 7.341756e+00 s Start 3285: mpi_dst_example_simple_lap_c_facto1_sched4_kway_tqrcpbegin Test #3286: mpi_dst_example_simple_lap_c_facto1_sched4_kway_tqrcpend ................***Timeout 617.40 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.096528e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.058233e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.738789e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.719268e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.311416e+01 s +-------------------------------------------------+ 
Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.471237e+00 s Time to initialize coeftab 8.780100e-01 s Start 3286: mpi_dst_example_simple_lap_c_facto1_sched4_kway_tqrcpend Test #3287: mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_tqrcpbegin ...***Timeout 616.76 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.501139e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol 
matrix 7.041508e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.292077e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.067318e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.396156e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.655558e+00 s Time to initialize coeftab 8.316154e+00 s Start 3287: mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_tqrcpbegin Test #3288: mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_tqrcpend .....***Timeout 616.75 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 
Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.657063e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.677548e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.292110e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 8.094564e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.803047e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.960284e+00 s Time to initialize coeftab 6.436658e-01 s Start 3288: mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_tqrcpend Test #3289: mpi_dst_example_simple_lap_c_facto1_sched4_not_rqrrtbegin ...............***Timeout 617.49 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI 
processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.630355e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.497276e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.505170e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.900721e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.514025e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.148396e+00 s Time to initialize coeftab 
8.109455e+00 s Start 3289: mpi_dst_example_simple_lap_c_facto1_sched4_not_rqrrtbegin Test #3290: mpi_dst_example_simple_lap_c_facto1_sched4_not_rqrrtend .................***Timeout 616.86 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.834191e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.144863e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time 
for reordering 2.274473e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.597414e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.074439e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.299571e+00 s Time to initialize coeftab 2.579493e+00 s Start 3290: mpi_dst_example_simple_lap_c_facto1_sched4_not_rqrrtend Test #3291: mpi_dst_example_simple_lap_c_facto1_sched4_kway_rqrrtbegin ..............***Timeout 616.85 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: 
Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.050822e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.174567e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.059006e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 9.200445e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.317745e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.793410e+00 s Time to initialize coeftab 1.497087e+01 s Start 3291: mpi_dst_example_simple_lap_c_facto1_sched4_kway_rqrrtbegin Test #3292: mpi_dst_example_simple_lap_c_facto1_sched4_kway_rqrrtend ................***Timeout 616.82 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: 
PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.305088e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.078696e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.663691e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 9.756983e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.425527e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 5.234851e+00 s Time to initialize coeftab 1.492764e+00 s Start 3292: mpi_dst_example_simple_lap_c_facto1_sched4_kway_rqrrtend Test #3293: mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_rqrrtbegin 
...***Timeout 616.55 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.662645e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.162981e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.326451e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 
Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.855705e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.935738e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 8.108417e+00 s Time to initialize coeftab 1.763898e+01 s Start 3293: mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_rqrrtbegin Test #3294: mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_rqrrtend .....***Timeout 616.55 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 
+-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.151238e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.699693e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.193395e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 9.233594e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.441503e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.275355e+00 s Time to initialize coeftab 1.483214e+00 s Start 3294: mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_rqrrtend Test #3295: mpi_dst_example_simple_lap_c_facto1_sched4_kway_pqrcpilu0 ...............***Timeout 616.42 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: 
Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.080375e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.962564e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.883080e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 8.017432e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.309940e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.563947e+00 s Time to initialize coeftab 5.243651e-01 s Start 3295: mpi_dst_example_simple_lap_c_facto1_sched4_kway_pqrcpilu0 Test #3296: mpi_dst_example_simple_lap_c_facto1_sched4_kway_pqrcpilu1 ...............***Timeout 616.42 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been 
automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.633937e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.497940e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.975304e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for 
mapping/scheduling 1.771737e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.604798e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 7.318037e+00 s Time to initialize coeftab 1.812033e+00 s Start 3296: mpi_dst_example_simple_lap_c_facto1_sched4_kway_pqrcpilu1 Test #3297: mpi_dst_example_simple_lap_c_facto2_sched4_not_svdbegin .................***Timeout 616.41 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.512961e+01 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.697293e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.029050e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.426078e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.403139e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 8.949397e+00 s Time to initialize coeftab 1.793108e+01 s Start 3297: mpi_dst_example_simple_lap_c_facto2_sched4_not_svdbegin Test #3299: mpi_dst_example_simple_lap_c_facto2_sched4_kway_svdbegin ................***Timeout 615.91 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal 
Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.474104e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.190778e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.441736e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 5.723933e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.376451e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.055722e+00 s Time to initialize coeftab 1.408233e+01 s Start 3299: mpi_dst_example_simple_lap_c_facto2_sched4_kway_svdbegin Test #3300: mpi_dst_example_simple_lap_c_facto2_sched4_kway_svdend ..................***Timeout 615.40 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.112118e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.336349e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.228632e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.127464e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.003788e+01 s 
+-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.141825e+01 s Time to initialize coeftab 1.319222e+00 s Start 3300: mpi_dst_example_simple_lap_c_facto2_sched4_kway_svdend Test #3301: mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_svdbegin .....***Timeout 614.89 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.496009e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 
Fill-in of L 17.592432 Time to compute symbol matrix 9.616755e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.039539e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.522594e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.138942e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 4.942645e+00 s Time to initialize coeftab 1.890911e+01 s Start 3301: mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_svdbegin Test #3302: mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_svdend .......***Timeout 614.88 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute 
Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.255915e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.004456e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.672109e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 7.067156e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.232631e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 4.764108e+00 s Time to initialize coeftab 1.011176e+00 s Start 3302: mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_svdend Test #3303: mpi_dst_example_simple_lap_c_facto2_sched4_not_pqrcpbegin ...............***Timeout 614.38 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 
256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.633584e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.015073e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.135072e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 8.374167e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.108107e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 8.091923e+00 s Time to 
initialize coeftab 8.870350e+00 s Start 3303: mpi_dst_example_simple_lap_c_facto2_sched4_not_pqrcpbegin Test #3305: mpi_dst_example_simple_lap_c_facto2_sched4_kway_pqrcpbegin ..............***Timeout 613.74 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch 1: 300 1140 2: 200 760 3: 200 660 Time to compute ordering 5.604050e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.224407e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping 
criterion -1 Time for reordering 3.206760e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 7.712304e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.564905e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 4.171226e+00 s Time to initialize coeftab 3.305772e+00 s Start 3305: mpi_dst_example_simple_lap_c_facto2_sched4_kway_pqrcpbegin Test #3325: mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_rqrrtbegin ...***Timeout 613.32 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been 
automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.146648e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.398556e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.893332e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 4.779898e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.886996e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.450973e+00 s Time to initialize coeftab 1.190264e+01 s Start 3325: mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_rqrrtbegin Test #3326: mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_rqrrtend .....***Timeout 613.32 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI 
processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.280924e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.281672e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.592670e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 7.014547e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.214495e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.555025e+00 s Time to initialize coeftab 3.262646e-01 s Start 3326: mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_rqrrtend Test #3329: 
mpi_dst_example_simple_lap_c_facto3_sched4_not_svdbegin .................***Timeout 613.31 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.036075e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.687878e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.878668e-01 s +-------------------------------------------------+ 
Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 8.220516e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.968597e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 8.434811e+00 s Time to initialize coeftab 1.069582e+01 s Start 3329: mpi_dst_example_simple_lap_c_facto3_sched4_not_svdbegin Test #3331: mpi_dst_example_simple_lap_c_facto3_sched4_kway_svdbegin ................***Timeout 612.11 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 
300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.296547e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.827393e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.294214e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 6.692668e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.278907e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 2.710109e+00 s Time to initialize coeftab 5.250877e+00 s Start 3331: mpi_dst_example_simple_lap_c_facto3_sched4_kway_svdbegin Test #3332: mpi_dst_example_simple_lap_c_facto3_sched4_kway_svdend ..................***Timeout 612.10 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) 
Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.813941e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.501980e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.436227e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.724124e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.981610e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 5.019822e+00 s Time to initialize coeftab 1.285959e+00 s Start 3332: mpi_dst_example_simple_lap_c_facto3_sched4_kway_svdend Test #3333: mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_svdbegin .....***Timeout 610.14 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has 
been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.613492e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.887912e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.723799e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 
5.477282e-04 s Time for mapping/scheduling 4.949241e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.726118e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 7.876582e+00 s Time to initialize coeftab 8.448138e+00 s Start 3333: mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_svdbegin Test #3343: mpi_dst_example_simple_lap_c_facto3_sched4_kway_rqrcpbegin ..............***Timeout 609.15 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal ischedInit: The thread number has been automatically set to 256 Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.536858e+01 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.879074e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.324649e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 8.367215e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.884102e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 4.040321e+00 s Time to initialize coeftab 8.269709e+00 s Start 3343: mpi_dst_example_simple_lap_c_facto3_sched4_kway_rqrcpbegin Test #3345: mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_rqrcpbegin ...***Timeout 608.26 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 
16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.388534e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.633241e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.762494e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.928417e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.185873e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 1.055546e+01 s Time to initialize coeftab 1.376413e+01 s Start 3345: mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_rqrcpbegin Test #3349: mpi_dst_example_simple_lap_c_facto3_sched4_kway_tqrcpbegin ..............***Timeout 607.64 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been 
automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.302201e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.361894e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.706161e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.151031e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 
5.069874e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 1.080740e+01 s Time to initialize coeftab 1.467408e+01 s Start 3349: mpi_dst_example_simple_lap_c_facto3_sched4_kway_tqrcpbegin Test #3351: mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_tqrcpbegin ...***Timeout 607.69 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 +-------------------------------------------------+ Ordering subtask : 1: 300 1140 2: 200 760 3: 200 660 Ordering method is: Scotch Time to compute ordering 3.982777e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes 
in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.420862e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.455547e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 9.565988e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.255034e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 4.479775e+00 s Time to initialize coeftab 1.199326e+01 s Start 3351: mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_tqrcpbegin Test #3361: mpi_dst_example_simple_lap_c_facto4_sched4_not_svdbegin .................***Timeout 606.94 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance 
criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.667188e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.666349e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.711247e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.295967e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.035628e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 3.250116e+00 s Time to initialize coeftab 6.178998e+00 s Start 3361: mpi_dst_example_simple_lap_c_facto4_sched4_not_svdbegin Test #3362: mpi_dst_example_simple_lap_c_facto4_sched4_not_svdend ...................***Timeout 605.98 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled 
thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.084065e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.434753e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.325272e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.826044e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.235358e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 4.026839e+00 s Time 
to initialize coeftab 8.104203e-01 s Start 3362: mpi_dst_example_simple_lap_c_facto4_sched4_not_svdend Test #3363: mpi_dst_example_simple_lap_c_facto4_sched4_kway_svdbegin ................***Timeout 605.34 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.458733e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.766513e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping 
criterion -1 Time for reordering 7.080543e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 4.461200e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.134780e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 1.696460e+00 s Time to initialize coeftab 4.287250e+00 s Start 3363: mpi_dst_example_simple_lap_c_facto4_sched4_kway_svdbegin Test #3365: mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_svdbegin .....***Timeout 605.12 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been 
automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.175218e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.396537e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.308146e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.941942e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.906166e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 7.050325e+00 s Time to initialize coeftab 1.333000e+01 s Start 3365: mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_svdbegin Test #3367: mpi_dst_example_simple_lap_c_facto4_sched4_not_pqrcpbegin ...............***Timeout 605.54 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 
MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.747589e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.662249e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.341167e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.678937e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.597827e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 5.299817e+00 s Time to initialize coeftab 6.142011e+00 s Start 3367: mpi_dst_example_simple_lap_c_facto4_sched4_not_pqrcpbegin Test #3369: 
mpi_dst_example_simple_lap_c_facto4_sched4_kway_pqrcpbegin ..............***Timeout 605.29 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.033808e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.280122e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.980192e-01 s +-------------------------------------------------+ Mapping/Scheduling 
subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 9.887459e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.450934e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 7.546273e+00 s Time to initialize coeftab 4.397407e+00 s Start 3369: mpi_dst_example_simple_lap_c_facto4_sched4_kway_pqrcpbegin Test #3371: mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_pqrcpbegin ...***Timeout 605.14 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 
1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.031575e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.800113e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.809500e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.460663e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.106115e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 5.039951e+00 s Time to initialize coeftab 4.385153e+00 s Start 3371: mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_pqrcpbegin Test #3373: mpi_dst_example_simple_lap_c_facto4_sched4_not_rqrcpbegin ...............***Timeout 604.54 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple 
Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.212890e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.019941e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.130016e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.100924e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.465000e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 8.610342e+00 s Time to initialize coeftab 1.549592e+01 s Start 3373: mpi_dst_example_simple_lap_c_facto4_sched4_not_rqrcpbegin Test #3374: mpi_dst_example_simple_lap_c_facto4_sched4_not_rqrcpend .................***Timeout 603.05 sec ischedInit: The thread number has been automatically set 
to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.126845e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.850869e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.130668e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 
MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.143661e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.470976e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 4.983899e+00 s Time to initialize coeftab 1.667017e+00 s Start 3374: mpi_dst_example_simple_lap_c_facto4_sched4_not_rqrcpend Test #3375: mpi_dst_example_simple_lap_c_facto4_sched4_kway_rqrcpbegin ..............***Timeout 602.14 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 
2.468727e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.512580e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.701496e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.069510e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.777944e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 6.358012e+00 s Time to initialize coeftab 1.235400e+01 s Start 3375: mpi_dst_example_simple_lap_c_facto4_sched4_kway_rqrcpbegin Test #3377: mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_rqrcpbegin ...***Timeout 602.01 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: 
Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.509509e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.337027e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.231611e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 8.592637e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.464111e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 5.282634e+00 s Time to initialize coeftab 8.179366e+00 s Start 3377: mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_rqrcpbegin Test #3379: mpi_dst_example_simple_lap_c_facto4_sched4_not_tqrcpbegin ...............***Timeout 602.00 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The 
thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.043204e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.301246e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.402719e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.770082e+01 s +-------------------------------------------------+ Analyze task: 
Total time for analyze 4.999925e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 5.170893e+00 s Time to initialize coeftab 2.063409e+01 s Start 3379: mpi_dst_example_simple_lap_c_facto4_sched4_not_tqrcpbegin Test #3381: mpi_dst_example_simple_lap_c_facto4_sched4_kway_tqrcpbegin ..............***Timeout 601.41 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.780649e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of 
nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.779330e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.297642e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.608181e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.906033e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 9.362294e+00 s Time to initialize coeftab 1.787880e+01 s Start 3381: mpi_dst_example_simple_lap_c_facto4_sched4_kway_tqrcpbegin Test #3383: mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_tqrcpbegin ...***Timeout 601.25 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance 
criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.567770e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.101665e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.631300e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.492919e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.729067e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 1.054552e+01 s Time to initialize coeftab 1.538694e+01 s Start 3383: mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_tqrcpbegin Test #3385: mpi_dst_example_simple_lap_c_facto4_sched4_not_rqrrtbegin ...............***Timeout 600.16 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 
Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.767868e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.828451e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.143027e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.109116e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.945684e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to 
initialize internal csc 6.596289e+00 s Time to initialize coeftab 1.459902e+01 s Start 3385: mpi_dst_example_simple_lap_c_facto4_sched4_not_rqrrtbegin Test #3387: mpi_dst_example_simple_lap_c_facto4_sched4_kway_rqrrtbegin ..............***Timeout 598.34 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.674973e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.020507e+00 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.671050e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.924462e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.506607e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 5.783548e+00 s Time to initialize coeftab 1.502198e+01 s Start 3387: mpi_dst_example_simple_lap_c_facto4_sched4_kway_rqrrtbegin Test #3389: mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_rqrrtbegin ...***Timeout 597.31 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections 
distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.944423e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.313377e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.255652e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.571436e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.285792e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 6.730375e+00 s Time to initialize coeftab 1.856405e+01 s Start 3389: mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_rqrrtbegin Test #3392: mpi_dst_example_simple_lap_c_facto4_sched4_kway_pqrcpilu1 ...............***Timeout 595.96 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled 
StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.746455e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.092178e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.278278e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 8.775587e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.865132e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 3.886051e+00 s Time to initialize coeftab 6.781799e-01 s Start 3392: 
mpi_dst_example_simple_lap_c_facto4_sched4_kway_pqrcpilu1 Test #3393: mpi_dst_example_simple_lap_z_facto0_sched4_not_svdbegin .................***Timeout 595.10 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.882547e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.246639e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.185409e-01 
s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.100465e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.322984e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.904708e+00 s Time to initialize coeftab 1.074751e+01 s Start 3393: mpi_dst_example_simple_lap_z_facto0_sched4_not_svdbegin Test #3394: mpi_dst_example_simple_lap_z_facto0_sched4_not_svdend ...................***Timeout 595.09 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 
Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.071936e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.003479e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.780511e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 4.353314e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.542863e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.431092e-01 s Time to initialize coeftab 4.834361e-01 s Start 3394: mpi_dst_example_simple_lap_z_facto0_sched4_not_svdend Test #3395: mpi_dst_example_simple_lap_z_facto0_sched4_kway_svdbegin ................***Timeout 594.16 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD 
Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.516035e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.445221e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.151648e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.535117e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.623065e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 7.939398e+00 s Time to initialize coeftab 1.246377e+01 s Start 3395: mpi_dst_example_simple_lap_z_facto0_sched4_kway_svdbegin Test #3396: mpi_dst_example_simple_lap_z_facto0_sched4_kway_svdend ..................***Timeout 592.84 sec ischedInit: The thread number 
has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.735126e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.333656e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.822164e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops 
Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.267844e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.414822e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.537810e+01 s Time to initialize coeftab 2.308940e+00 s Start 3396: mpi_dst_example_simple_lap_z_facto0_sched4_kway_svdend Test #3397: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_svdbegin .....***Timeout 591.56 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: 
Scotch Time to compute ordering 2.554463e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.200563e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.872893e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.270484e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.895422e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 9.329927e+00 s Time to initialize coeftab 1.202387e+01 s Start 3397: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_svdbegin Test #3399: mpi_dst_example_simple_lap_z_facto0_sched4_not_pqrcpbegin ...............***Timeout 588.71 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 
GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.930033e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.405992e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.371544e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.138398e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.276565e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 8.072055e+00 s Time to initialize coeftab 4.838167e+00 s Start 3399: mpi_dst_example_simple_lap_z_facto0_sched4_not_pqrcpbegin Test #3401: mpi_dst_example_simple_lap_z_facto0_sched4_kway_pqrcpbegin ..............***Timeout 587.92 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.238044e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.915331e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.779055e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 6.564895e+00 s +-------------------------------------------------+ Analyze 
task: Total time for analyze 4.345343e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.108042e+00 s Time to initialize coeftab 3.844937e+00 s Start 3401: mpi_dst_example_simple_lap_z_facto0_sched4_kway_pqrcpbegin Test #3403: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_pqrcpbegin ...***Timeout 587.20 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.472250e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax 
Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.130801e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.404209e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 9.702524e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.686579e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.376752e+00 s Time to initialize coeftab 3.937529e+00 s Start 3403: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_pqrcpbegin Test #3405: mpi_dst_example_simple_lap_z_facto0_sched4_not_rqrcpbegin ...............***Timeout 587.19 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress 
min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.332140e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.894174e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.873453e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.245262e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.872222e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 8.334982e+00 s Time to initialize coeftab 1.175269e+01 s Start 3405: mpi_dst_example_simple_lap_z_facto0_sched4_not_rqrcpbegin Test #3407: mpi_dst_example_simple_lap_z_facto0_sched4_kway_rqrcpbegin ..............***Timeout 587.18 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread 
dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.412221e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.289082e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.396520e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.543738e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.620032e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize 
internal csc 6.937369e+00 s Time to initialize coeftab 1.136816e+01 s Start 3407: mpi_dst_example_simple_lap_z_facto0_sched4_kway_rqrcpbegin Test #3409: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_rqrcpbegin ...***Timeout 586.46 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.613787e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.001645e+00 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.864493e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 7.978512e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.882570e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.640505e+00 s Time to initialize coeftab 8.968784e+00 s Start 3409: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_rqrcpbegin Test #3411: mpi_dst_example_simple_lap_z_facto0_sched4_not_tqrcpbegin ...............***Timeout 585.37 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The 
thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.401824e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.100678e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.040216e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 6.690098e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.456762e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.676659e+00 s Time to initialize coeftab 1.055298e+01 s Start 3411: mpi_dst_example_simple_lap_z_facto0_sched4_not_tqrcpbegin Test #3412: mpi_dst_example_simple_lap_z_facto0_sched4_not_tqrcpend .................***Timeout 585.36 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started 
thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.360194e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.492165e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.238689e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 6.351214e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.151347e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.316642e+00 s Time to initialize coeftab 2.853157e+00 s Start 3412: 
mpi_dst_example_simple_lap_z_facto0_sched4_not_tqrcpend Test #3413: mpi_dst_example_simple_lap_z_facto0_sched4_kway_tqrcpbegin ..............***Timeout 584.53 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.535304e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.188894e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.256200e-01 s 
+-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.641538e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.478335e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 8.979023e+00 s Time to initialize coeftab 1.530501e+01 s Start 3413: mpi_dst_example_simple_lap_z_facto0_sched4_kway_tqrcpbegin Test #3217: mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_rqrcpbegin ...***Timeout 583.03 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric 
Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.638838e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.712423e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.064264e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.176627e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.280850e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 8.077218e+00 s Time to initialize coeftab 9.142048e+00 s Test #3252: mpi_dst_example_simple_lap_c_facto0_sched4_not_tqrcpend .................***Timeout 582.85 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models 
CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.610841e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.198152e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.960839e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.121108e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.796893e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.840404e+00 s Time to initialize coeftab 2.231355e+00 s Test #3414: mpi_dst_example_simple_lap_z_facto0_sched4_kway_tqrcpend ................***Timeout 582.20 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread 
number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.209400e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.954053e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.197698e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.434166e+01 s 
+-------------------------------------------------+ Analyze task: Total time for analyze 4.062645e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.650704e+00 s Time to initialize coeftab 1.825915e+00 s Start 3414: mpi_dst_example_simple_lap_z_facto0_sched4_kway_tqrcpend Test #3415: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_tqrcpbegin ...***Timeout 581.19 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.320227e+01 s +-------------------------------------------------+ 
Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.137934e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.475181e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 6.427618e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.239889e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.655763e+00 s Time to initialize coeftab 9.516285e+00 s Start 3415: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_tqrcpbegin Test #3416: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_tqrcpend .....***Timeout 580.50 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections 
Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.230224e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.201630e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.096207e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 8.078517e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.349654e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.137276e+00 s Time to initialize coeftab 8.643571e-01 s Start 3416: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_tqrcpend Test #3417: mpi_dst_example_simple_lap_z_facto0_sched4_not_rqrrtbegin ...............***Timeout 579.49 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse 
matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.622226e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.839230e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.616674e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.190024e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.919078e+01 s 
+-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.423999e+01 s Time to initialize coeftab 1.419420e+01 s Start 3417: mpi_dst_example_simple_lap_z_facto0_sched4_not_rqrrtbegin Test #3418: mpi_dst_example_simple_lap_z_facto0_sched4_not_rqrrtend .................***Timeout 578.59 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.897491e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of 
L 17.592432 Time to compute symbol matrix 3.618657e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.976959e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.933927e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.625824e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 9.247371e+00 s Time to initialize coeftab 3.123108e+00 s Start 3418: mpi_dst_example_simple_lap_z_facto0_sched4_not_rqrrtend Test #3419: mpi_dst_example_simple_lap_z_facto0_sched4_kway_rqrrtbegin ..............***Timeout 578.29 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization 
method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.580041e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.050233e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.653376e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 6.563359e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.552076e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.832474e+00 s Time to initialize coeftab 9.894128e+00 s Start 3419: mpi_dst_example_simple_lap_z_facto0_sched4_kway_rqrrtbegin Test #3420: mpi_dst_example_simple_lap_z_facto0_sched4_kway_rqrrtend ................***Timeout 577.61 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: 
sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.321768e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.292695e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.813839e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.261609e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.831045e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.891200e+00 s Time to initialize coeftab 2.393278e+00 s Start 
3420: mpi_dst_example_simple_lap_z_facto0_sched4_kway_rqrrtend Test #3421: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_rqrrtbegin ...***Timeout 576.58 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.924864e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.651823e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for 
reordering 2.817902e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 6.184019e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.590513e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.959453e+00 s Time to initialize coeftab 1.132379e+01 s Start 3421: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_rqrrtbegin Test #3422: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_rqrrtend .....***Timeout 576.30 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections 
width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.362427e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.067333e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.658760e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.756167e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.496113e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.228949e+01 s Time to initialize coeftab 1.440855e+00 s Start 3422: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_rqrrtend Test #3423: mpi_dst_example_simple_lap_z_facto0_sched4_kway_pqrcpilu0 ...............***Timeout 575.44 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of 
threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.703077e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.273800e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.351862e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 7.327662e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.539214e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 5.750487e+00 s Time to initialize coeftab 2.189468e+00 s Start 3423: mpi_dst_example_simple_lap_z_facto0_sched4_kway_pqrcpilu0 Test #3424: mpi_dst_example_simple_lap_z_facto0_sched4_kway_pqrcpilu1 
...............***Timeout 575.06 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.174477e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.029724e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.140494e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 
17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 7.970668e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.057974e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.340957e+00 s Time to initialize coeftab 1.888632e+00 s Start 3424: mpi_dst_example_simple_lap_z_facto0_sched4_kway_pqrcpilu1 Test #3425: mpi_dst_example_simple_lap_z_facto1_sched4_not_svdbegin .................***Timeout 574.79 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 
+-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.447300e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.717059e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.227174e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.942373e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.081657e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.097254e+01 s Time to initialize coeftab 1.621472e+01 s Start 3425: mpi_dst_example_simple_lap_z_facto1_sched4_not_svdbegin Test #3426: mpi_dst_example_simple_lap_z_facto1_sched4_not_svdend ...................***Timeout 574.27 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 
320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.804832e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.631314e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.819097e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.006442e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.757104e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 8.632667e+00 s Time to initialize coeftab 5.200249e+00 s Start 3426: mpi_dst_example_simple_lap_z_facto1_sched4_not_svdend Test #3427: mpi_dst_example_simple_lap_z_facto1_sched4_kway_svdbegin ................***Timeout 573.32 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 
256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.903552e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.719191e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.390729e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 
1.999269e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.702112e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 7.967061e+00 s Time to initialize coeftab 1.285821e+01 s Start 3427: mpi_dst_example_simple_lap_z_facto1_sched4_kway_svdbegin Test #3428: mpi_dst_example_simple_lap_z_facto1_sched4_kway_svdend ..................***Timeout 572.32 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.730425e+01 s +-------------------------------------------------+ Symbolic 
factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.796495e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.539904e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.731023e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.106220e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.134858e+01 s Time to initialize coeftab 1.924795e+00 s Start 3428: mpi_dst_example_simple_lap_z_facto1_sched4_kway_svdend Test #3429: mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_svdbegin .....***Timeout 571.28 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 
16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.081194e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.097808e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.884359e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.261684e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.914244e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.181471e+01 s Time to initialize coeftab 1.841416e+01 s Start 3429: mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_svdbegin Test #3430: mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_svdend .......***Timeout 570.30 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 
Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.410562e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.527548e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.048431e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.163173e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.327914e+01 s 
+-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.101821e+01 s Time to initialize coeftab 1.709024e+00 s Start 3430: mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_svdend Test #3431: mpi_dst_example_simple_lap_z_facto1_sched4_not_pqrcpbegin ...............***Timeout 569.28 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.097507e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 
Fill-in of L 17.592432 Time to compute symbol matrix 6.544588e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.630127e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.222300e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.405974e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.114508e-01 s Time to initialize coeftab 6.363024e+00 s Start 3431: mpi_dst_example_simple_lap_z_facto1_sched4_not_pqrcpbegin Test #3433: mpi_dst_example_simple_lap_z_facto1_sched4_kway_pqrcpbegin ..............***Timeout 568.41 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute 
Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.729278e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.320404e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.709886e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.167606e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.443971e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 6.788755e+00 s Time to initialize coeftab 6.799714e+00 s Start 3433: mpi_dst_example_simple_lap_z_facto1_sched4_kway_pqrcpbegin Test #3435: mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_pqrcpbegin ...***Timeout 568.37 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 
6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.381427e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.188318e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.878828e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.859762e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.981588e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.422019e+01 s Time to 
initialize coeftab 6.293642e+00 s Start 3435: mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_pqrcpbegin Test #3436: mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_pqrcpend .....***Timeout 568.34 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.745891e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.532165e-02 s +-------------------------------------------------+ Reordering 
subtask: Split level 0 Stoping criterion -1 Time for reordering 1.827537e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.704268e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.592832e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.322903e+00 s Time to initialize coeftab 5.550070e-01 s Time to factorize 1.004531e+02 s (217.21 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Start 3436: mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_pqrcpend Test #3437: mpi_dst_example_simple_lap_z_facto1_sched4_not_rqrcpbegin ...............***Timeout 568.07 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 
GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 3: 200 660 2: 200 760 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.114382e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.631019e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.657929e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.229507e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.637338e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 9.567068e+00 s Time to initialize coeftab 1.744502e+01 s Start 3437: mpi_dst_example_simple_lap_z_facto1_sched4_not_rqrcpbegin Test #3439: mpi_dst_example_simple_lap_z_facto1_sched4_kway_rqrcpbegin ..............***Timeout 567.11 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.376731e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.520322e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.694813e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 
1.997566e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.299699e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 7.972523e+00 s Time to initialize coeftab 1.220575e+01 s Start 3439: mpi_dst_example_simple_lap_z_facto1_sched4_kway_rqrcpbegin Test #3440: mpi_dst_example_simple_lap_z_facto1_sched4_kway_rqrcpend ................***Timeout 566.55 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.878134e+01 s +-------------------------------------------------+ 
Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.463777e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.535537e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.870831e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.778193e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.216118e+01 s Time to initialize coeftab 3.358106e+00 s Start 3440: mpi_dst_example_simple_lap_z_facto1_sched4_kway_rqrcpend Test #3441: mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_rqrcpbegin ...***Timeout 566.05 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 
1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.416627e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.678720e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.382119e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.827727e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.101208e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 6.546092e+00 s Time to initialize coeftab 1.720103e+01 s Start 3441: mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_rqrcpbegin Test #3189: mpi_dst_example_simple_lap_d_facto1_sched4_kway_tqrcpbegin ..............***Timeout 535.60 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.691673e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.324974e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.173927e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 3.369644e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.388211e+01 s 
+-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 5.319586e+00 s Time to initialize coeftab 7.737982e+00 s Test #3190: mpi_dst_example_simple_lap_d_facto1_sched4_kway_tqrcpend ................***Timeout 535.54 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.007956e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.537790e+00 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.575541e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.978981e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.712205e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.872098e+00 s Time to initialize coeftab 8.827152e-01 s Test #3191: mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_tqrcpbegin ...***Timeout 535.43 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread 
number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.750730e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.482766e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.430001e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.689502e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.854932e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 8.579416e+00 s Time to initialize coeftab 1.046052e+01 s Test #3192: mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_tqrcpend .....***Timeout 535.04 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI 
communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.812079e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.906265e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.094793e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.007132e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.729754e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.163448e+01 s Time to initialize coeftab 1.903123e+00 s Time to factorize 4.020740e+01 s (133.29 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank 
supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Test #3193: mpi_dst_example_simple_lap_d_facto1_sched4_not_rqrrtbegin ...............***Timeout 534.83 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.549962e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute 
symbol matrix 2.010133e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.985379e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 9.463696e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.669223e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 6.063898e+00 s Time to initialize coeftab 6.719833e+00 s Test #3194: mpi_dst_example_simple_lap_d_facto1_sched4_not_rqrrtend .................***Timeout 534.80 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 
Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.525706e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.368140e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.253368e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.363728e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.653640e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 9.540918e+00 s Time to initialize coeftab 2.881606e+00 s Test #3195: mpi_dst_example_simple_lap_d_facto1_sched4_kway_rqrrtbegin ..............***Timeout 534.79 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 
320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.921101e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.034400e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.243436e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.747219e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.227567e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 6.655803e+00 s Time to initialize coeftab 8.306091e+00 s Test #3196: mpi_dst_example_simple_lap_d_facto1_sched4_kway_rqrrtend ................***Timeout 534.54 sec ischedInit: The thread number has been automatically set to 256 
ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch 1: 300 1140 2: 200 760 3: 200 660 Time to compute ordering 3.018917e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.018307e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.173322e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to 
factorize 6.932254e-04 s Time for mapping/scheduling 1.335776e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.699968e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.043139e+01 s Time to initialize coeftab 3.117684e+00 s Time to factorize 4.889301e+01 s (109.61 KFlop/s) Number of operations 26.58 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko Test #3197: mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_rqrrtbegin ...***Timeout 534.09 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set 
to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.484237e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.215458e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.216694e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.604193e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.579497e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.164861e+01 s Time to initialize coeftab 9.301118e+00 s Test #3199: mpi_dst_example_simple_lap_d_facto1_sched4_kway_pqrcpilu0 ...............***Timeout 532.42 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: 
PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.082515e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.699754e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.441472e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.793601e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.814686e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.889994e-02 s Time to initialize coeftab 3.047704e-01 s Test #3200: mpi_dst_example_simple_lap_d_facto1_sched4_kway_pqrcpilu1 ...............***Timeout 531.72 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been 
automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.813385e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.829306e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.952237e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 5.23 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for 
mapping/scheduling 1.152369e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.313472e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 9.329535e+00 s Time to initialize coeftab 2.479676e+00 s Test #3201: mpi_dst_example_simple_lap_d_facto2_sched4_not_svdbegin .................***Timeout 531.08 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.919593e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization 
using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.967997e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.822157e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.983047e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.849255e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.218343e+01 s Time to initialize coeftab 1.118800e+01 s Test #3202: mpi_dst_example_simple_lap_d_facto2_sched4_not_svdend ...................***Timeout 530.40 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization 
method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.674005e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.356587e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.156895e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.533161e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.910731e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 9.854318e+00 s Time to initialize coeftab 2.614886e+00 s Time to factorize 5.965738e+01 s (171.38 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Test #3203: mpi_dst_example_simple_lap_d_facto2_sched4_kway_svdbegin ................***Timeout 529.69 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been 
automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.157099e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.998288e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.236104e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for 
mapping/scheduling 1.561352e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.361815e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 7.988551e+00 s Time to initialize coeftab 1.051947e+01 s Test #3205: mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_svdbegin .....***Timeout 528.78 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.266080e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol 
factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.729236e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.371642e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 8.233861e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.228916e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.361326e+00 s Time to initialize coeftab 7.762780e+00 s Test #3206: mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_svdend .......***Timeout 528.74 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time ischedInit: The thread number has been automatically set to 256 Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute 
Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.964600e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.073146e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.897002e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.978240e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.860665e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 8.715608e+00 s Time to initialize coeftab 2.232719e+00 s Test #3207: mpi_dst_example_simple_lap_d_facto2_sched4_not_pqrcpbegin ...............***Timeout 528.54 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI 
processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.315482e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.202803e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.035658e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.661290e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.010900e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.529439e+00 s Time to initialize coeftab 2.791228e+00 s Time to factorize 9.447218e+01 s (108.22 KFlop/s) Number of 
operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Test #3211: mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_pqrcpbegin ...***Timeout 527.81 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.709938e+01 s +-------------------------------------------------+ Symbolic 
factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.601534e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.841170e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 8.198488e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.834629e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 5.591922e+00 s Time to initialize coeftab 3.978988e+00 s Test #3213: mpi_dst_example_simple_lap_d_facto2_sched4_not_rqrcpbegin ...............***Timeout 527.33 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance 
criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.876460e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.207675e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.663662e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.266961e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.629247e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 9.719174e+00 s Time to initialize coeftab 7.617088e+00 s Test #3214: mpi_dst_example_simple_lap_d_facto2_sched4_not_rqrcpend .................***Timeout 527.31 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread 
static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.782151e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.242351e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.347424e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.226749e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.324565e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 8.014646e+00 s Time to initialize coeftab 2.474988e+00 s Time to factorize 6.390911e+01 s (159.98 
KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Test #3215: mpi_dst_example_simple_lap_d_facto2_sched4_kway_rqrcpbegin ..............***Timeout 527.30 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.873885e+01 s +-------------------------------------------------+ Symbolic 
factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.459370e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.049084e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.749167e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.938689e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 8.020662e+00 s Time to initialize coeftab 1.313977e+01 s Test #3216: mpi_dst_example_simple_lap_d_facto2_sched4_kway_rqrcpend ................***Timeout 527.04 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance 
criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.343515e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.497126e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.326328e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.565761e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.147698e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 9.457944e+00 s Time to initialize coeftab 1.945261e+00 s Test #3218: mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_rqrcpend .....***Timeout 526.79 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: 
Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.280184e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.619965e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.283267e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.899983e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.277863e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 9.508415e+00 s Time to initialize coeftab 1.264292e+00 s Time to factorize 5.965001e+01 s 
(171.40 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Test #3219: mpi_dst_example_simple_lap_d_facto2_sched4_not_tqrcpbegin ...............***Timeout 526.41 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.506655e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 
17.592432 Time to compute symbol matrix 3.234317e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.712009e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.753408e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.538387e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.219278e+00 s Time to initialize coeftab 1.018500e+01 s Test #3220: mpi_dst_example_simple_lap_d_facto2_sched4_not_tqrcpend .................***Timeout 526.23 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of 
kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.753948e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.162318e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.115097e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.519928e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.930069e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 7.514366e+00 s Time to initialize coeftab 2.128364e+00 s Time to factorize 7.260936e+01 s (140.81 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Test #3221: mpi_dst_example_simple_lap_d_facto2_sched4_kway_tqrcpbegin ..............***Timeout 526.21 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number 
of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.104168e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.978764e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.571744e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.954071e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.519432e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 9.534341e+00 s Time to initialize coeftab 1.208403e+01 s Test #3222: 
mpi_dst_example_simple_lap_d_facto2_sched4_kway_tqrcpend ................***Timeout 526.19 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.645468e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.782588e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.634958e-01 s +-------------------------------------------------+ Mapping/Scheduling 
subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.563405e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.362596e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 5.295089e+00 s Time to initialize coeftab 1.449297e+00 s Time to factorize 6.506505e+01 s (157.14 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Test #3223: mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_tqrcpbegin ...***Timeout 525.87 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 
Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.857492e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.287430e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.306812e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.713834e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.151566e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.060164e+01 s Time to initialize coeftab 1.073305e+01 s Test #3224: mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_tqrcpend .....***Timeout 525.77 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: 
sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.541267e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.858611e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.216796e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.733132e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.862207e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 8.852588e+00 s Time to initialize coeftab 1.322820e+00 s 
Time to factorize 5.685504e+01 s (179.83 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Test #3225: mpi_dst_example_simple_lap_d_facto2_sched4_not_rqrrtbegin ...............***Timeout 525.75 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.616027e+01 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.146756e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.978116e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 7.087639e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.495667e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.959415e+00 s Time to initialize coeftab 4.435829e+01 s Test #3227: mpi_dst_example_simple_lap_d_facto2_sched4_kway_rqrrtbegin ..............***Timeout 525.67 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance 
criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.746923e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.986043e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.517229e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.329417e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.763759e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.216877e+00 s Time to initialize coeftab 1.013859e+01 s Test #3229: mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_rqrrtbegin ...***Timeout 525.33 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ 
Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.608839e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.187728e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.919633e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.912164e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.525054e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.013174e+01 s Time to 
initialize coeftab 1.138549e+01 s Test #3230: mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_rqrrtend .....***Timeout 525.23 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.963468e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.473560e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.714212e+00 s 
+-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.110977e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.746362e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.030865e+01 s Time to initialize coeftab 1.466274e+00 s Time to factorize 4.993705e+01 s (204.74 KFlop/s) Number of operations 8.17 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko Test #3231: mpi_dst_example_simple_lap_d_facto2_sched4_kway_pqrcpilu0 ...............***Timeout 525.08 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 
1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.316846e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.818822e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.738642e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.934462e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.972665e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 9.598401e+00 s Time to initialize coeftab 1.785981e+00 s Test #3234: mpi_dst_example_simple_lap_c_facto0_sched4_not_svdend ...................***Timeout 523.15 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: 
Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.443124e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.619416e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.937708e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.031357e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.482266e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.078739e+01 s Time to initialize coeftab 1.181006e+00 s Test #3240: 
mpi_dst_example_simple_lap_c_facto0_sched4_not_pqrcpend .................***Timeout 479.55 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.561469e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.416471e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.533216e+00 s +-------------------------------------------------+ 
Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.828489e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.207957e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 8.241215e+00 s Time to initialize coeftab 1.940006e+00 s Test #3242: mpi_dst_example_simple_lap_c_facto0_sched4_kway_pqrcpend ................***Timeout 479.33 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 
+-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.471392e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.112025e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.170084e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.008960e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.244118e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 7.059835e+00 s Time to initialize coeftab 1.503564e+00 s Test #3243: mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_pqrcpbegin ...***Timeout 479.31 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: 
Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.481636e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.586353e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.311095e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.193497e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.438530e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.606065e+00 s Time to initialize coeftab 6.091953e+00 s Test #3244: mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_pqrcpend .....***Timeout 478.94 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been 
automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.709456e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.698284e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.655653e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.018394e+01 s +-------------------------------------------------+ Analyze task: Total time for 
analyze 5.589626e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 8.104380e+00 s Time to initialize coeftab 3.324014e+00 s Test #3246: mpi_dst_example_simple_lap_c_facto0_sched4_not_rqrcpend .................***Timeout 476.45 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.999231e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 
1.510065e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.021454e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.220392e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.611245e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 9.123376e+00 s Time to initialize coeftab 2.285002e+00 s Test #3248: mpi_dst_example_simple_lap_c_facto0_sched4_kway_rqrcpend ................***Timeout 475.39 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 
Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.241430e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.907485e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.969594e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.179850e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.719802e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.259545e+00 s Time to initialize coeftab 1.278184e+00 s Test #3249: mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_rqrcpbegin ...***Timeout 475.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI 
communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.651194e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.340688e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.225659e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.275338e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.853806e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 7.884578e+00 s Time to initialize coeftab 1.388521e+01 s Test #3250: mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_rqrcpend .....***Timeout 475.37 sec ischedInit: The thread number has been automatically set to 256 
ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 +-------------------------------------------------+ Ordering subtask : 1: 300 1140 2: 200 760 3: 200 660 Ordering method is: Scotch Time to compute ordering 2.433360e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.135530e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.253524e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 
6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.222924e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.895845e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.022142e+01 s Time to initialize coeftab 1.930943e+00 s Test #3254: mpi_dst_example_simple_lap_c_facto0_sched4_kway_tqrcpend ................***Timeout 473.57 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.826738e+01 s +-------------------------------------------------+ 
Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.758099e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.925745e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.100979e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.847321e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.118337e+01 s Time to initialize coeftab 1.759959e+00 s Test #3256: mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_tqrcpend .....***Timeout 473.35 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 
Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.695129e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.833110e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.871179e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.052342e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.871108e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.022576e+01 s Time to initialize coeftab 1.935967e+00 s Test #3368: mpi_dst_example_simple_lap_c_facto4_sched4_not_pqrcpend .................***Timeout 443.78 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: 
sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.912729e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.780126e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.049238e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.891171e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.625639e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 9.741138e+00 s Time to initialize coeftab 2.209410e+00 s 
3246/3626 Test #3510: mpi_dst_example_simple_lap_z_facto3_sched4_kway_tqrcpend ................***Timeout 381.09 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3510: mpi_dst_example_simple_lap_z_facto3_sched4_kway_tqrcpend 3246/3626 Test #3511: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_tqrcpbegin ...***Timeout 381.10 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3511: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_tqrcpbegin 3246/3626 Test #3512: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_tqrcpend .....***Timeout 381.10 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI 
communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3512: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_tqrcpend 3246/3626 Test #3513: mpi_dst_example_simple_lap_z_facto3_sched4_not_rqrrtbegin ...............***Timeout 379.54 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 
Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3513: mpi_dst_example_simple_lap_z_facto3_sched4_not_rqrrtbegin 3246/3626 Test #3514: mpi_dst_example_simple_lap_z_facto3_sched4_not_rqrrtend .................***Timeout 379.54 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 
+-------------------------------------------------+ Ordering subtask : Start 3514: mpi_dst_example_simple_lap_z_facto3_sched4_not_rqrrtend 3246/3626 Test #3515: mpi_dst_example_simple_lap_z_facto3_sched4_kway_rqrrtbegin ..............***Timeout 379.54 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3515: mpi_dst_example_simple_lap_z_facto3_sched4_kway_rqrrtbegin 3246/3626 Test #3516: mpi_dst_example_simple_lap_z_facto3_sched4_kway_rqrrtend ................***Timeout 380.09 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Start 3516: mpi_dst_example_simple_lap_z_facto3_sched4_kway_rqrrtend 3246/3626 Test #3517: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_rqrrtbegin ...***Timeout 378.64 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI 
processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3517: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_rqrrtbegin 3246/3626 Test #3518: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_rqrrtend .....***Timeout 378.65 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress 
minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3518: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_rqrrtend 3246/3626 Test #3519: mpi_dst_example_simple_lap_z_facto3_sched4_kway_pqrcpilu0 ...............***Timeout 378.67 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 
0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3519: mpi_dst_example_simple_lap_z_facto3_sched4_kway_pqrcpilu0 3246/3626 Test #3520: mpi_dst_example_simple_lap_z_facto3_sched4_kway_pqrcpilu1 ...............***Timeout 378.69 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 3520: mpi_dst_example_simple_lap_z_facto3_sched4_kway_pqrcpilu1 3246/3626 Test #3521: mpi_dst_example_simple_lap_z_facto4_sched4_not_svdbegin .................***Timeout 378.73 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started ischedInit: The thread number has been automatically set to 256 PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3521: mpi_dst_example_simple_lap_z_facto4_sched4_not_svdbegin 3246/3626 Test #3522: mpi_dst_example_simple_lap_z_facto4_sched4_not_svdend ...................***Timeout 378.76 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution 
level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3522: mpi_dst_example_simple_lap_z_facto4_sched4_not_svdend 3246/3626 Test #3523: mpi_dst_example_simple_lap_z_facto4_sched4_kway_svdbegin ................***Timeout 376.30 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections 
distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3523: mpi_dst_example_simple_lap_z_facto4_sched4_kway_svdbegin 3246/3626 Test #3524: mpi_dst_example_simple_lap_z_facto4_sched4_kway_svdend ..................***Timeout 374.64 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch 1: 300 1140 2: 200 760 3: 200 660 Start 3524: 
mpi_dst_example_simple_lap_z_facto4_sched4_kway_svdend 3246/3626 Test #3525: mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_svdbegin .....***Timeout 374.71 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 3525: mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_svdbegin 3246/3626 Test #3526: mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_svdend .......***Timeout 375.75 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 
3526: mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_svdend 3246/3626 Test #3527: mpi_dst_example_simple_lap_z_facto4_sched4_not_pqrcpbegin ...............***Timeout 377.38 sec Start 3527: mpi_dst_example_simple_lap_z_facto4_sched4_not_pqrcpbegin 3246/3626 Test #3528: mpi_dst_example_simple_lap_z_facto4_sched4_not_pqrcpend .................***Timeout 378.45 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 3528: mpi_dst_example_simple_lap_z_facto4_sched4_not_pqrcpend 3246/3626 Test #3529: mpi_dst_example_simple_lap_z_facto4_sched4_kway_pqrcpbegin ..............***Timeout 379.01 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 
256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3529: mpi_dst_example_simple_lap_z_facto4_sched4_kway_pqrcpbegin 3246/3626 Test #3530: mpi_dst_example_simple_lap_z_facto4_sched4_kway_pqrcpend ................***Timeout 379.69 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 Start 3530: mpi_dst_example_simple_lap_z_facto4_sched4_kway_pqrcpend 3246/3626 Test #3531: mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_pqrcpbegin ...***Timeout 379.71 sec ischedInit: The thread number has been automatically set to 256 ischedInit: 
The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 3531: mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_pqrcpbegin 3246/3626 Test #3532: mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_pqrcpend .....***Timeout 379.72 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 
Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Start 3532: mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_pqrcpend 3246/3626 Test #3533: mpi_dst_example_simple_lap_z_facto4_sched4_not_rqrcpbegin ...............***Timeout 379.73 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 
Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3533: mpi_dst_example_simple_lap_z_facto4_sched4_not_rqrcpbegin 3246/3626 Test #3534: mpi_dst_example_simple_lap_z_facto4_sched4_not_rqrcpend .................***Timeout 380.35 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP ischedInit: The thread number has been automatically set to 256 Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Start 3534: mpi_dst_example_simple_lap_z_facto4_sched4_not_rqrcpend 3246/3626 Test #3535: mpi_dst_example_simple_lap_z_facto4_sched4_kway_rqrcpbegin ..............***Timeout 379.51 sec ischedInit: The thread number has been automatically set to 256 
ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3535: mpi_dst_example_simple_lap_z_facto4_sched4_kway_rqrcpbegin 3246/3626 Test #3536: mpi_dst_example_simple_lap_z_facto4_sched4_kway_rqrcpend ................***Timeout 380.09 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled 
thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Start 3536: mpi_dst_example_simple_lap_z_facto4_sched4_kway_rqrcpend 3246/3626 Test #3537: mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_rqrcpbegin ...***Timeout 381.11 sec Start 3537: mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_rqrcpbegin 3246/3626 Test #3538: mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_rqrcpend .....***Timeout 381.58 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 
Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3538: mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_rqrcpend 3246/3626 Test #3539: mpi_dst_example_simple_lap_z_facto4_sched4_not_tqrcpbegin ...............***Timeout 381.59 sec Start 3539: mpi_dst_example_simple_lap_z_facto4_sched4_not_tqrcpbegin 3246/3626 Test #3540: mpi_dst_example_simple_lap_z_facto4_sched4_not_tqrcpend .................***Timeout 382.18 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: 
Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3540: mpi_dst_example_simple_lap_z_facto4_sched4_not_tqrcpend 3246/3626 Test #3541: mpi_dst_example_simple_lap_z_facto4_sched4_kway_tqrcpbegin ..............***Timeout 382.28 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 
Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3541: mpi_dst_example_simple_lap_z_facto4_sched4_kway_tqrcpbegin 3246/3626 Test #3542: mpi_dst_example_simple_lap_z_facto4_sched4_kway_tqrcpend ................***Timeout 383.09 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3542: mpi_dst_example_simple_lap_z_facto4_sched4_kway_tqrcpend 3246/3626 Test #3543: mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_tqrcpbegin ...***Timeout 383.09 sec ischedInit: The 
thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 3543: mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_tqrcpbegin 3246/3626 Test #3544: mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_tqrcpend .....***Timeout 382.87 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3544: mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_tqrcpend 3246/3626 Test #3545: mpi_dst_example_simple_lap_z_facto4_sched4_not_rqrrtbegin ...............***Timeout 
382.88 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3545: mpi_dst_example_simple_lap_z_facto4_sched4_not_rqrrtbegin 3246/3626 Test #3546: mpi_dst_example_simple_lap_z_facto4_sched4_not_rqrrtend .................***Timeout 383.88 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 3546: mpi_dst_example_simple_lap_z_facto4_sched4_not_rqrrtend 3246/3626 Test #3547: 
mpi_dst_example_simple_lap_z_facto4_sched4_kway_rqrrtbegin ..............***Timeout 384.90 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3547: mpi_dst_example_simple_lap_z_facto4_sched4_kway_rqrrtbegin 3246/3626 Test #3548: mpi_dst_example_simple_lap_z_facto4_sched4_kway_rqrrtend ................***Timeout 384.91 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 Start 3548: mpi_dst_example_simple_lap_z_facto4_sched4_kway_rqrrtend 3246/3626 Test #3549: mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_rqrrtbegin ...***Timeout 384.49 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - 
Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Start 3549: mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_rqrrtbegin 3246/3626 Test #3550: mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_rqrrtend .....***Timeout 383.76 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 
Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3550: mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_rqrrtend 3246/3626 Test #3551: mpi_dst_example_simple_lap_z_facto4_sched4_kway_pqrcpilu0 ...............***Timeout 382.59 sec Start 3551: mpi_dst_example_simple_lap_z_facto4_sched4_kway_pqrcpilu0 3246/3626 Test #3552: mpi_dst_example_simple_lap_z_facto4_sched4_kway_pqrcpilu1 ...............***Timeout 380.66 sec Start 3552: mpi_dst_example_simple_lap_z_facto4_sched4_kway_pqrcpilu1 3246/3626 Test #3565: bcsc_shm_test_bcsc_spmv_time_rsa ........................................***Timeout 378.74 sec RSA driver is no longer supported and is replaced by the HB driver ischedInit: The thread number has been automatically set to 256 Start 3565: bcsc_shm_test_bcsc_spmv_time_rsa 3246/3626 Test #3570: bcsc_shm_test_bvec_tests ................................................***Timeout 378.68 sec ischedInit: The thread number has been automatically set to 256 Start 3570: bcsc_shm_test_bvec_tests 3246/3626 Test #3572: bcsc_mpi_rep_test_bcsc_spmv_tests_lap_s .................................***Timeout 378.18 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size 
(min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Start 3572: bcsc_mpi_rep_test_bcsc_spmv_tests_lap_s 3246/3626 Test #3573: bcsc_mpi_rep_test_bcsc_spmv_tests_lap_d .................................***Timeout 378.48 sec Start 3573: bcsc_mpi_rep_test_bcsc_spmv_tests_lap_d 3246/3626 Test #3574: bcsc_mpi_rep_test_bcsc_spmv_tests_lap_c .................................***Timeout 379.16 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Start 3574: bcsc_mpi_rep_test_bcsc_spmv_tests_lap_c 3246/3626 Test #3575: bcsc_mpi_rep_test_bcsc_spmv_tests_lap_z .................................***Timeout 379.95 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 3575: bcsc_mpi_rep_test_bcsc_spmv_tests_lap_z 3246/3626 Test #3576: bcsc_mpi_rep_test_bcsc_spmv_tests_rsa ...................................***Timeout 379.09 sec RSA driver is no longer supported and is replaced 
by the HB driver RSA driver is no longer supported and is replaced by the HB driver RSA driver is no longer supported and is replaced by the HB driver Start 3576: bcsc_mpi_rep_test_bcsc_spmv_tests_rsa 3246/3626 Test #3577: bcsc_mpi_rep_test_bcsc_spmv_tests_mm ....................................***Timeout 378.32 sec Start 3577: bcsc_mpi_rep_test_bcsc_spmv_tests_mm 3246/3626 Test #3579: bcsc_mpi_rep_test_bcsc_spmv_tests_mm2 ...................................***Timeout 378.67 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 3579: bcsc_mpi_rep_test_bcsc_spmv_tests_mm2 3246/3626 Test #3580: bcsc_mpi_rep_test_bcsc_spmv_time_lap_s ..................................***Timeout 379.96 sec Start 3580: bcsc_mpi_rep_test_bcsc_spmv_time_lap_s 3246/3626 Test #3581: bcsc_mpi_rep_test_bcsc_spmv_time_lap_d ..................................***Timeout 380.73 sec Start 3581: bcsc_mpi_rep_test_bcsc_spmv_time_lap_d 3246/3626 Test #3582: bcsc_mpi_rep_test_bcsc_spmv_time_lap_c ..................................***Timeout 381.13 sec Start 3582: bcsc_mpi_rep_test_bcsc_spmv_time_lap_c 3246/3626 Test #3583: bcsc_mpi_rep_test_bcsc_spmv_time_lap_z ..................................***Timeout 382.23 sec Start 3583: bcsc_mpi_rep_test_bcsc_spmv_time_lap_z 3246/3626 Test #3584: bcsc_mpi_rep_test_bcsc_spmv_time_rsa ....................................***Timeout 382.88 sec RSA driver is no longer supported and is replaced by the HB driver RSA driver is no longer supported and is replaced by the HB driver RSA driver is no longer supported and is replaced by the HB driver RSA driver is no longer supported and is replaced by the HB driver Start 3584: bcsc_mpi_rep_test_bcsc_spmv_time_rsa 3246/3626 Test #3585: bcsc_mpi_rep_test_bcsc_spmv_time_mm .....................................***Timeout 383.87 sec Start 3585: bcsc_mpi_rep_test_bcsc_spmv_time_mm 
3246/3626 Test #3586: bcsc_mpi_rep_test_bcsc_spmv_time_hb .....................................***Timeout 384.49 sec Start 3586: bcsc_mpi_rep_test_bcsc_spmv_time_hb 3246/3626 Test #3587: bcsc_mpi_rep_test_bcsc_spmv_time_mm2 ....................................***Timeout 383.96 sec Start 3587: bcsc_mpi_rep_test_bcsc_spmv_time_mm2 3246/3626 Test #3589: bcsc_mpi_rep_test_bvec_tests ............................................***Timeout 382.79 sec Start 3589: bcsc_mpi_rep_test_bvec_tests 3246/3626 Test #3591: bcsc_mpi_dst_test_bcsc_spmv_tests_lap_s .................................***Timeout 384.31 sec Start 3591: bcsc_mpi_dst_test_bcsc_spmv_tests_lap_s 3246/3626 Test #3592: bcsc_mpi_dst_test_bcsc_spmv_tests_lap_d .................................***Timeout 385.41 sec Start 3592: bcsc_mpi_dst_test_bcsc_spmv_tests_lap_d 3246/3626 Test #3593: bcsc_mpi_dst_test_bcsc_spmv_tests_lap_c .................................***Timeout 385.83 sec Start 3593: bcsc_mpi_dst_test_bcsc_spmv_tests_lap_c 3246/3626 Test #3594: bcsc_mpi_dst_test_bcsc_spmv_tests_lap_z .................................***Timeout 386.25 sec Start 3594: bcsc_mpi_dst_test_bcsc_spmv_tests_lap_z 3246/3626 Test #3595: bcsc_mpi_dst_test_bcsc_spmv_tests_rsa ...................................***Timeout 384.45 sec RSA driver is no longer supported and is replaced by the HB driver RSA driver is no longer supported and is replaced by the HB driver RSA driver is no longer supported and is replaced by the HB driver RSA driver is no longer supported and is replaced by the HB driver Start 3595: bcsc_mpi_dst_test_bcsc_spmv_tests_rsa 3246/3626 Test #3596: bcsc_mpi_dst_test_bcsc_spmv_tests_mm ....................................***Timeout 384.47 sec Start 3596: bcsc_mpi_dst_test_bcsc_spmv_tests_mm 3246/3626 Test #3598: bcsc_mpi_dst_test_bcsc_spmv_tests_mm2 ...................................***Timeout 383.34 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically 
set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Start 3598: bcsc_mpi_dst_test_bcsc_spmv_tests_mm2 3246/3626 Test #3599: bcsc_mpi_dst_test_bcsc_spmv_time_lap_s ..................................***Timeout 381.85 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 3599: bcsc_mpi_dst_test_bcsc_spmv_time_lap_s 3246/3626 Test #3600: bcsc_mpi_dst_test_bcsc_spmv_time_lap_d ..................................***Timeout 382.29 sec Start 3600: bcsc_mpi_dst_test_bcsc_spmv_time_lap_d 3246/3626 Test #3601: bcsc_mpi_dst_test_bcsc_spmv_time_lap_c ..................................***Timeout 381.04 sec Start 3601: bcsc_mpi_dst_test_bcsc_spmv_time_lap_c 3246/3626 Test #3602: bcsc_mpi_dst_test_bcsc_spmv_time_lap_z ..................................***Timeout 379.90 sec Start 3602: bcsc_mpi_dst_test_bcsc_spmv_time_lap_z 3246/3626 Test #3603: bcsc_mpi_dst_test_bcsc_spmv_time_rsa ....................................***Timeout 378.88 sec Start 3603: bcsc_mpi_dst_test_bcsc_spmv_time_rsa 3246/3626 Test #3604: bcsc_mpi_dst_test_bcsc_spmv_time_mm .....................................***Timeout 379.20 sec Start 3604: bcsc_mpi_dst_test_bcsc_spmv_time_mm 3246/3626 
Test #3605: bcsc_mpi_dst_test_bcsc_spmv_time_hb .....................................***Timeout 377.95 sec Start 3605: bcsc_mpi_dst_test_bcsc_spmv_time_hb 3246/3626 Test #3606: bcsc_mpi_dst_test_bcsc_spmv_time_mm2 ....................................***Timeout 377.59 sec Start 3606: bcsc_mpi_dst_test_bcsc_spmv_time_mm2 3246/3626 Test #3607: bcsc_mpi_dst_test_bvec_tests ............................................***Timeout 377.59 sec Start 3607: bcsc_mpi_dst_test_bvec_tests 3246/3626 Test #3608: bcsc_mpi_dst_test_bvec_applyorder_tests .................................***Timeout 377.59 sec Start 3608: bcsc_mpi_dst_test_bvec_applyorder_tests 3246/3626 Test #3610: fortran_mpi_fsimple .....................................................***Timeout 377.96 sec Start 3610: fortran_mpi_fsimple 3246/3626 Test #3612: fortran_mpi_flaplacian ..................................................***Timeout 374.07 sec Start 3612: fortran_mpi_flaplacian 3246/3626 Test #3614: fortran_mpi_fstep-by-step ...............................................***Timeout 371.91 sec Start 3614: fortran_mpi_fstep-by-step 3246/3626 Test #3616: fortran_mpi_fmultidof ...................................................***Timeout 372.78 sec Start 3616: fortran_mpi_fmultidof 3246/3626 Test #3618: fortran_mpi_fusermat_csr ................................................***Timeout 372.39 sec Start 3618: fortran_mpi_fusermat_csr 3246/3626 Test #3619: fortran_shm_fmultilap_seq ...............................................***Timeout 373.40 sec !--------------------------------------------------------------------! ! Multiple Laplacian testing configuration ! !--------------------------------------------------------------------! 
Nb of threads = 5 Nb of PaStiX instances = 1 Nb of outer iterations = 2 Nb of distinct matrices = 2 Nb of RHS to solve per matrix = 10 Nb of solve phase to perform per RHS = 2 Size of each matrix = 1000 ( 10 x 10 x 10 ) Nbr of non zero entries per matrix = 3700 The multirhs mode is disabled !--------------------------------------------------------------------! Size of x = 1000 Matrix NRHS Start End 1 2 1 2 1 2 3 4 1 2 5 6 1 2 7 8 1 2 9 10 2 2 1 2 2 2 3 4 2 2 5 6 2 2 7 8 2 2 9 10 Start 3619: fortran_shm_fmultilap_seq 3246/3626 Test #3620: fortran_shm_fmultilap_mt ................................................***Timeout 373.93 sec Start 3620: fortran_shm_fmultilap_mt 3246/3626 Test #3622: python_mpi_simple .......................................................***Timeout 374.94 sec Start 3622: python_mpi_simple 3246/3626 Test #3624: python_mpi_step-by-step .................................................***Timeout 375.68 sec Start 3624: python_mpi_step-by-step 3246/3626 Test #3626: python_mpi_simple_obj ...................................................***Timeout 376.78 sec Start 3626: python_mpi_simple_obj 3246/3626 Test #3613: fortran_shm_fstep-by-step ...............................................***Timeout 381.68 sec Start 3613: fortran_shm_fstep-by-step 3246/3626 Test #3615: fortran_shm_fmultidof ...................................................***Timeout 380.28 sec Start 3615: fortran_shm_fmultidof 3246/3626 Test #3623: python_shm_step-by-step .................................................***Timeout 379.41 sec Start 3623: python_shm_step-by-step Test #3236: mpi_dst_example_simple_lap_c_facto0_sched4_kway_svdend ..................***Timeout 396.09 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.216925e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.212966e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.172959e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.698593e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.458342e+01 s 
+-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.797663e+00 s Time to initialize coeftab 1.052057e+00 s Time to factorize 1.722031e+02 s (120.60 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Test #3260: mpi_dst_example_simple_lap_c_facto0_sched4_kway_rqrrtend ................***Timeout 389.14 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 
1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.851248e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.400074e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.092981e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.667180e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.153919e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 9.018460e+00 s Time to initialize coeftab 1.910490e+00 s Test #3304: mpi_dst_example_simple_lap_c_facto2_sched4_not_pqrcpend .................***Timeout 383.98 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 
6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.452314e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.258578e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.302022e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.569678e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.556754e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 8.499758e+00 s Time to initialize coeftab 3.955992e+00 s Time to factorize 1.042309e+02 s (392.68 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko 
------------------------------------------------ Total 226 Ko / 226 Ko Test #3306: mpi_dst_example_simple_lap_c_facto2_sched4_kway_pqrcpend ................***Timeout 382.13 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.205821e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.708351e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 
6.098097e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.731038e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.447175e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.008690e+01 s Time to initialize coeftab 2.282317e+00 s Time to factorize 8.864212e+01 s (461.74 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Test #3308: mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_pqrcpend .....***Timeout 381.48 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block 
Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.129729e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.318700e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.405188e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.027791e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.877892e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 8.208681e+00 s Time to initialize coeftab 2.017118e+00 s Time to factorize 9.486644e+01 s (431.44 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Test #3314: 
mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_rqrcpend .....***Timeout 378.75 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.501908e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.654430e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.674277e+00 s +-------------------------------------------------+ 
Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 5.770771e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.968536e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 4.750747e-02 s Time to initialize coeftab 1.007505e+00 s Time to factorize 1.373553e+02 s (297.98 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Test #3318: mpi_dst_example_simple_lap_c_facto2_sched4_kway_tqrcpend ................***Timeout 373.90 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 
Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.009097e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.707145e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.569324e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.331732e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.647071e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 7.196781e+00 s Time to initialize coeftab 1.975479e+00 s Time to factorize 1.510649e+02 s (270.94 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Test #3336: mpi_dst_example_simple_lap_c_facto3_sched4_not_pqrcpend .................***Timeout 379.24 sec ischedInit: The 
thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.538012e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.151502e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.368411e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 
20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.418661e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.497918e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 5.474647e+00 s Time to initialize coeftab 1.303101e+00 s Time to factorize 9.681939e+01 s (214.50 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko Test #3338: mpi_dst_example_simple_lap_c_facto3_sched4_kway_pqrcpend ................***Timeout 376.40 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 
3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.609580e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.793265e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.517897e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.339029e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.637345e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 7.932834e+00 s Time to initialize coeftab 1.652315e+00 s Time to factorize 5.237913e+01 s (396.49 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.638905e+01 s Time for refinement 1.007310e+01 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 
6.822264e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822264e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.930354e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.930354e-07 max(|| b_i - A x_i ||_1) 8.766829e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.212176e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.930354e-07 max(|| b_i - A x_i ||_1) 8.766829e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.212176e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.930354e-07 max(|| b_i - A x_i ||_1) 8.766829e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.212176e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.766829e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.212176e+00 (SUCCESS) Test #3342: mpi_dst_example_simple_lap_c_facto3_sched4_not_rqrcpend .................***Timeout 374.73 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 
2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.672877e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.944154e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.274635e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.219193e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.999565e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 5.939046e+00 s Time to initialize coeftab 1.956436e+00 s Time to factorize 1.360631e+02 s (152.63 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Test #3346: mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_rqrcpend .....***Timeout 373.96 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been 
automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.062575e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.442309e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.441546e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.835969e+01 s 
+-------------------------------------------------+ Analyze task: Total time for analyze 6.036115e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 8.960419e+00 s Time to initialize coeftab 1.640483e+00 s Time to factorize 1.200895e+02 s (172.93 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Test #3350: mpi_dst_example_simple_lap_c_facto3_sched4_kway_tqrcpend ................***Timeout 372.83 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 
200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.851343e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.912542e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.623454e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.403561e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.486190e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 8.080225e+00 s Time to initialize coeftab 1.205331e+00 s Time to factorize 1.215113e+02 s (170.91 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Test #3354: mpi_dst_example_simple_lap_c_facto3_sched4_not_rqrrtend .................***Timeout 373.36 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.674308e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.312944e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.146546e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 9.693483e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.448772e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to 
initialize internal csc 5.717195e+00 s Time to initialize coeftab 1.555753e+00 s Time to factorize 1.334794e+02 s (155.59 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Test #3356: mpi_dst_example_simple_lap_c_facto3_sched4_kway_rqrrtend ................***Timeout 372.53 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering 
method is: Scotch Time to compute ordering 2.898771e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.003748e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.936421e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 4.174788e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.341039e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 2.049797e-01 s Time to initialize coeftab 6.553227e-01 s Time to factorize 1.984867e+02 s (104.63 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Test #3358: mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_rqrrtend .....***Timeout 372.44 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread 
static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.942865e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.631119e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.050806e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.688626e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.256306e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 8.829325e+00 s Time to initialize coeftab 1.998834e+00 s Time to factorize 
5.516900e+01 s (376.44 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Test #3372: mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_pqrcpend .....***Timeout 372.71 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.174561e+01 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.630501e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.631755e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.255867e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.946309e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 5.369484e+00 s Time to initialize coeftab 1.416087e+00 s Test #3376: mpi_dst_example_simple_lap_c_facto4_sched4_kway_rqrcpend ................***Timeout 371.26 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting 
Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.194522e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.684510e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.148066e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.240817e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.057937e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 4.567054e+00 s Time to initialize coeftab 1.648360e+00 s Time to factorize 1.635899e+02 s (133.38 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko Test #3380: mpi_dst_example_simple_lap_c_facto4_sched4_not_tqrcpend .................***Timeout 368.94 sec ischedInit: The thread number has been automatically set to 
256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.947479e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.846054e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.768915e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL 
Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.249324e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.498269e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 8.939772e+00 s Time to initialize coeftab 1.769858e+00 s Time to factorize 9.599826e+01 s (227.29 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Test #3386: mpi_dst_example_simple_lap_c_facto4_sched4_not_rqrrtend .................***Timeout 367.78 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 
Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.958523e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.284669e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.864130e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.459944e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.826376e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 1.874520e+00 s Time to initialize coeftab 1.111907e+00 s Time to factorize 6.541605e+01 s (333.55 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.109445e+01 s Time for refinement 8.575653e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 
6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.017009e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.017009e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.017009e-07 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.017009e-07 max(|| b_i - A x_i ||_1) 8.272669e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.087482e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.272669e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.087482e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.272669e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.087482e+00 (SUCCESS) max(|| b_i - A x_i ||_1) 8.272669e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.087482e+00 (SUCCESS) Test #3388: mpi_dst_example_simple_lap_c_facto4_sched4_kway_rqrrtend ................***Timeout 367.44 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels 
of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.889021e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.900231e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.606149e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.743961e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.265285e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 7.404778e+00 s Time to initialize coeftab 1.315430e+00 s Time to factorize 5.637523e+01 s (387.04 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Test #3390: mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_rqrrtend .....***Timeout 367.42 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.025454e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.098989e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.631219e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for 
mapping/scheduling 1.425794e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.644084e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 6.137679e+00 s Time to initialize coeftab 2.623988e+00 s Test #3402: mpi_dst_example_simple_lap_z_facto0_sched4_kway_pqrcpend ................***Timeout 366.98 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.846975e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization 
using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.790096e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.095775e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.707664e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.589309e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.391342e+00 s Time to initialize coeftab 1.882609e+00 s Time to factorize 5.194596e+01 s (399.79 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Test #3404: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_pqrcpend .....***Timeout 366.97 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models 
CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.401974e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.279244e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.913380e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.433827e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.955035e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.997936e+00 s Time to initialize coeftab 4.949115e+00 s Time to factorize 9.369714e+01 s (221.65 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Test #3442: 
mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_rqrcpend .....***Timeout 361.80 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.018907e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.584760e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.596035e-01 s +-------------------------------------------------+ 
Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.087383e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.266898e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 8.989318e+00 s Time to initialize coeftab 1.485379e+00 s Time to factorize 1.788229e+02 s (122.02 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Start 3442: mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_rqrcpend Test #3448: mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_tqrcpend .....***Timeout 362.71 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 
Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.821291e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.900326e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.097648e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.712359e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.653099e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.293282e+00 s Time to initialize coeftab 1.500122e+00 s Time to factorize 9.054173e+01 s (240.99 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Start 3448: mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_tqrcpend Test #3450: 
mpi_dst_example_simple_lap_z_facto1_sched4_not_rqrrtend .................***Timeout 360.40 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.246925e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.669679e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.664415e-01 s +-------------------------------------------------+ 
Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.081405e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.957070e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 7.237113e+00 s Time to initialize coeftab 1.717742e+00 s Time to factorize 1.040498e+02 s (209.70 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko Start 3450: mpi_dst_example_simple_lap_z_facto1_sched4_not_rqrrtend Test #3464: mpi_dst_example_simple_lap_z_facto2_sched4_not_pqrcpend .................***Timeout 359.67 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 
16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.944017e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.409305e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.146548e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.020114e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.757433e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 7.905059e+00 s Time to initialize coeftab 1.938696e+00 s Time to factorize 9.230818e+01 s (443.40 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Start 3464: mpi_dst_example_simple_lap_z_facto2_sched4_not_pqrcpend Test #3470: mpi_dst_example_simple_lap_z_facto2_sched4_not_rqrcpend .................***Timeout 360.58 sec ischedInit: The thread number has been automatically set to 256 
ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.652990e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.487712e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.440543e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time 
to factorize 1.548061e-03 s Time for mapping/scheduling 2.860995e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.988379e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.039845e+00 s Time to initialize coeftab 8.522194e-01 s Time to factorize 1.398981e+02 s (292.57 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Start 3470: mpi_dst_example_simple_lap_z_facto2_sched4_not_rqrcpend Test #3491: mpi_dst_example_simple_lap_z_facto3_sched4_kway_svdbegin ................***Timeout 360.73 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections 
width 1
[arch-nspawn-3655178:925930] *** Process received signal ***
[arch-nspawn-3655178:925930] Signal: Segmentation fault (11)
[arch-nspawn-3655178:925930] Signal code: Address not mapped (1)
[arch-nspawn-3655178:925930] Failing at address: 0xfffffff8
[arch-nspawn-3655178:925930] [ 0] linux-vdso.so.1(__vdso_rt_sigreturn+0x0) [0x7fb5ea0b66cc]
[arch-nspawn-3655178:925930] [ 1] /usr/lib/libopen-pal.so.80(+0x744f2) [0x7fb5e835a4f2]
[arch-nspawn-3655178:925930] [ 2] /usr/lib/libopen-pal.so.80(opal_progress+0x30) [0x7fb5e830ba7a]
[arch-nspawn-3655178:925930] [ 3] /usr/lib/libopen-pal.so.80(ompi_sync_wait_mt+0xda) [0x7fb5e8338aa2]
[arch-nspawn-3655178:925930] [ 4] /usr/lib/libmpi.so.40(+0x7de1a) [0x7fb5e867de1a]
[arch-nspawn-3655178:925930] [ 5] /usr/lib/libmpi.so.40(ompi_request_default_wait+0x1a) [0x7fb5e868019c]
[arch-nspawn-3655178:925930] [ 6] /usr/lib/libmpi.so.40(ompi_coll_base_sendrecv_actual+0x98) [0x7fb5e86f03e8]
[arch-nspawn-3655178:925930] [ 7] /usr/lib/libmpi.so.40(ompi_coll_base_allreduce_intra_recursivedoubling+0x210) [0x7fb5e86f1a88]
[arch-nspawn-3655178:925930] [ 8] /usr/lib/libmpi.so.40(ompi_coll_base_allreduce_intra_ring+0x3fc) [0x7fb5e86f443c]
[arch-nspawn-3655178:925930] [ 9] /usr/lib/libmpi.so.40(ompi_coll_tuned_allreduce_intra_dec_fixed+0x40) [0x7fb5e8715152]
[arch-nspawn-3655178:925930] [10] /usr/lib/libmpi.so.40(MPI_Allreduce+0x294) [0x7fb5e868e584]
[arch-nspawn-3655178:925930] [11] /build/pastix/src/build/spm/src/libspm.so.1(spmUpdateComputedFields+0x140) [0x7fb5e8abe458]
[arch-nspawn-3655178:925930] [12] /build/pastix/src/build/spm/src/libspm.so.1(genLaplacian+0xaa) [0x7fb5e8ac721e]
[arch-nspawn-3655178:925930] [13] /build/pastix/src/build/spm/src/libspm.so.1(+0x409c8) [0x7fb5e8ac89c8]
[arch-nspawn-3655178:925930] [14] ./simple(+0xe2c) [0x555555556e2c]
[arch-nspawn-3655178:925930] [15] /usr/lib/libc.so.6(+0x27fae) [0x7fb5e892cfae]
[arch-nspawn-3655178:925930] [16] /usr/lib/libc.so.6(__libc_start_main+0x72) [0x7fb5e892d0b8]
[arch-nspawn-3655178:925930] [17] ./simple(+0x1174) [0x555555557174]
[arch-nspawn-3655178:925930] *** End of error message ***
--------------------------------------------------------------------------
prte noticed that process rank 3 with PID 925930 on node arch-nspawn-3655178 exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
Start 3491: mpi_dst_example_simple_lap_z_facto3_sched4_kway_svdbegin Test #3498: mpi_dst_example_simple_lap_z_facto3_sched4_kway_pqrcpend ................***Timeout 362.63 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to
compute ordering 4.130393e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.719219e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.016186e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.952445e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.399545e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 5.225097e+00 s Time to initialize coeftab 1.635889e+00 s Time to factorize 5.387295e+01 s (385.49 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Start 3498: mpi_dst_example_simple_lap_z_facto3_sched4_kway_pqrcpend Test #3272: mpi_dst_example_simple_lap_c_facto1_sched4_not_pqrcpend .................***Timeout 259.61 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 
Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.068943e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.246701e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.167326e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.637999e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.438200e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.861869e+00 s Time to initialize coeftab 
2.532525e+00 s Test #3274: mpi_dst_example_simple_lap_c_facto1_sched4_kway_pqrcpend ................***Timeout 258.49 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.820636e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.627684e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.651909e+00 s 
+-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.872957e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.473990e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 8.996716e+00 s Time to initialize coeftab 1.710983e+00 s Test #3553: bcsc_shm_test_bcsc_spmv_tests_lap_s .....................................***Timeout 138.13 sec ischedInit: The thread number has been automatically set to 256 Start 3553: bcsc_shm_test_bcsc_spmv_tests_lap_s Test #3554: bcsc_shm_test_bcsc_spmv_tests_lap_d .....................................***Timeout 138.14 sec ischedInit: The thread number has been automatically set to 256 Start 3554: bcsc_shm_test_bcsc_spmv_tests_lap_d Test #3555: bcsc_shm_test_bcsc_spmv_tests_lap_c .....................................***Timeout 138.14 sec ischedInit: The thread number has been automatically set to 256 Start 3555: bcsc_shm_test_bcsc_spmv_tests_lap_c Test #3556: bcsc_shm_test_bcsc_spmv_tests_lap_z .....................................***Timeout 138.14 sec ischedInit: The thread number has been automatically set to 256 Start 3556: bcsc_shm_test_bcsc_spmv_tests_lap_z Test #3557: bcsc_shm_test_bcsc_spmv_tests_rsa .......................................***Timeout 138.84 sec RSA driver is no longer supported and is replaced by the HB driver ischedInit: The thread number has been automatically set to 256 Start 3557: bcsc_shm_test_bcsc_spmv_tests_rsa Test #3558: bcsc_shm_test_bcsc_spmv_tests_mm ........................................***Timeout 138.84 sec ischedInit: The thread number has been automatically set to 256 Start 3558: bcsc_shm_test_bcsc_spmv_tests_mm Test #3559: 
bcsc_shm_test_bcsc_spmv_tests_hb ........................................***Timeout 140.55 sec ischedInit: The thread number has been automatically set to 256 Start 3559: bcsc_shm_test_bcsc_spmv_tests_hb Test #3560: bcsc_shm_test_bcsc_spmv_tests_mm2 .......................................***Timeout 141.32 sec ischedInit: The thread number has been automatically set to 256 Start 3560: bcsc_shm_test_bcsc_spmv_tests_mm2 Test #3561: bcsc_shm_test_bcsc_spmv_time_lap_s ......................................***Timeout 141.86 sec ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 3561: bcsc_shm_test_bcsc_spmv_time_lap_s Test #3562: bcsc_shm_test_bcsc_spmv_time_lap_d ......................................***Timeout 142.59 sec ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Start 3562: bcsc_shm_test_bcsc_spmv_time_lap_d Test #3563: bcsc_shm_test_bcsc_spmv_time_lap_c ......................................***Timeout 144.06 sec ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 3563: bcsc_shm_test_bcsc_spmv_time_lap_c Test #3564: bcsc_shm_test_bcsc_spmv_time_lap_z ......................................***Timeout 144.10 sec ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 3564: bcsc_shm_test_bcsc_spmv_time_lap_z Test #3566: bcsc_shm_test_bcsc_spmv_time_mm .........................................***Timeout 144.77 sec ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Complex64 Format: IJV N: 841 nnz: 2465 Start 3566: bcsc_shm_test_bcsc_spmv_time_mm Test #3567: bcsc_shm_test_bcsc_spmv_time_hb .........................................***Timeout 147.26 sec ischedInit: The 
thread number has been automatically set to 256 Matrix type: General Arithmetic: Double Format: CSC N: 1030 nnz: 6858 Start 3567: bcsc_shm_test_bcsc_spmv_time_hb Test #3568: bcsc_shm_test_bcsc_spmv_time_mm2 ........................................***Timeout 148.03 sec ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: IJV N: 1280 nnz: 12029 Start 3568: bcsc_shm_test_bcsc_spmv_time_mm2 Test #3569: bcsc_shm_test_bvec_gemv_tests ...........................................***Timeout 148.03 sec ischedInit: The thread number has been automatically set to 256 Case Float - Sequential: SUCCESS Case Double - Sequential: SUCCESS Case Complex32 - Sequential: SUCCESS Case Complex64 - Sequential: SUCCESS Case Float - Static: SUCCESS Case Double - Static: SUCCESS Case Complex32 - Static: SUCCESS Case Complex64 - Static: SUCCESS -- All tests PASSED -- Start 3569: bcsc_shm_test_bvec_gemv_tests Test #3571: bcsc_shm_test_bvec_applyorder_tests .....................................***Timeout 147.91 sec ischedInit: The thread number has been automatically set to 256 Check b == (P^t (P b)) / Case Single DoF - Replicated - Float - 1 nrhs: SUCCESS Check b == (P^t (P b)) / Case Single DoF - Replicated - Double - 1 nrhs: SUCCESS Check b == (P^t (P b)) / Case Single DoF - Replicated - Complex32 - 1 nrhs: SUCCESS Check b == (P^t (P b)) / Case Single DoF - Replicated - Complex64 - 1 nrhs: SUCCESS Check b == (P^t (P b)) / Case Single DoF - Replicated - Float - 5 nrhs: SUCCESS Check b == (P^t (P b)) / Case Single DoF - Replicated - Double - 5 nrhs: SUCCESS Check b == (P^t (P b)) / Case Single DoF - Replicated - Complex32 - 5 nrhs: SUCCESS Check b == (P^t (P b)) / Case Single DoF - Replicated - Complex64 - 5 nrhs: SUCCESS Check b == (P^t (P b)) / Case Constant Multi DoF - Replicated - Float - 1 nrhs: SUCCESS Check b == (P^t (P b)) / Case Constant Multi DoF - Replicated - Double - 1 nrhs: SUCCESS Check b == (P^t (P b)) / Case 
Constant Multi DoF - Replicated - Complex32 - 1 nrhs: SUCCESS Check b == (P^t (P b)) / Case Constant Multi DoF - Replicated - Complex64 - 1 nrhs: SUCCESS Check b == (P^t (P b)) / Case Constant Multi DoF - Replicated - Float - 5 nrhs: SUCCESS Check b == (P^t (P b)) / Case Constant Multi DoF - Replicated - Double - 5 nrhs: SUCCESS Check b == (P^t (P b)) / Case Constant Multi DoF - Replicated - Complex32 - 5 nrhs: SUCCESS Check b == (P^t (P b)) / Case Constant Multi DoF - Replicated - Complex64 - 5 nrhs: SUCCESS -- All tests PASSED -- Start 3571: bcsc_shm_test_bvec_applyorder_tests Test #3590: bcsc_mpi_rep_test_bvec_applyorder_tests .................................***Timeout 145.10 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 3590: bcsc_mpi_rep_test_bvec_applyorder_tests Test #3597: bcsc_mpi_dst_test_bcsc_spmv_tests_hb ....................................***Timeout 143.38 sec Start 3597: bcsc_mpi_dst_test_bcsc_spmv_tests_hb Test #3609: fortran_shm_fsimple .....................................................***Timeout 138.93 sec Start 3609: fortran_shm_fsimple Test #3611: fortran_shm_flaplacian ..................................................***Timeout 138.82 sec Start 3611: fortran_shm_flaplacian Test #3617: fortran_shm_fusermat_csr ................................................***Timeout 136.99 sec Start 3617: fortran_shm_fusermat_csr Test #3233: mpi_dst_example_simple_lap_c_facto0_sched4_not_svdbegin .................***Timeout 460.31 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: 
sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.069814e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.903594e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.342633e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.431237e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.398707e+01 s +-------------------------------------------------+ 
Factorization task: Factorization used: LL^h Time to initialize internal csc 5.888504e+00 s Time to initialize coeftab 1.081601e+01 s Test #3235: mpi_dst_example_simple_lap_c_facto0_sched4_kway_svdbegin ................***Timeout 460.25 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.268985e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.925495e-02 s +-------------------------------------------------+ Reordering 
subtask: Split level 0 Stoping criterion -1 Time for reordering 3.961755e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.775139e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.515243e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.528474e+00 s Time to initialize coeftab 1.098989e+01 s Test #3237: mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_svdbegin .....***Timeout 459.19 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal ischedInit: The thread number has been automatically set to 256 Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian 
Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.409866e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.889197e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.414504e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.940235e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.640486e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 7.775617e+00 s Time to initialize coeftab 1.051261e+01 s Test #3238: mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_svdend .......***Timeout 458.00 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution 
level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.464006e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.165663e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.451803e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 9.484119e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.649933e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.278197e+00 s Time to initialize coeftab 1.401882e+00 s Time to factorize 2.322758e+02 s (89.41 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Test #3239: mpi_dst_example_simple_lap_c_facto0_sched4_not_pqrcpbegin ...............***Timeout 457.41 sec ischedInit: The thread 
number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.264741e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.938572e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.254244e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 
MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.622009e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.496681e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.040924e+00 s Time to initialize coeftab 4.856191e+00 s Time to factorize 2.191949e+02 s (94.75 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Test #3241: mpi_dst_example_simple_lap_c_facto0_sched4_kway_pqrcpbegin ..............***Timeout 455.29 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: 
Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.375354e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.732371e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.414124e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.840922e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.393910e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.048783e+00 s Time to initialize coeftab 5.039356e+00 s Test #3245: mpi_dst_example_simple_lap_c_facto0_sched4_not_rqrcpbegin ...............***Timeout 453.46 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 
Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.047122e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.346221e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.293101e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.373775e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.918606e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 9.168313e+00 s Time to initialize coeftab 1.652948e+01 s Test #3247: mpi_dst_example_simple_lap_c_facto0_sched4_kway_rqrcpbegin ..............***Timeout 453.45 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.556279e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.198056e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.493831e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 
2.253510e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.517252e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.610951e+00 s Time to initialize coeftab 8.324492e+00 s Test #3251: mpi_dst_example_simple_lap_c_facto0_sched4_not_tqrcpbegin ...............***Timeout 451.59 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.090037e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct 
Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.083106e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.167148e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.272735e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.297579e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.092722e+01 s Time to initialize coeftab 1.849094e+01 s Test #3253: mpi_dst_example_simple_lap_c_facto0_sched4_kway_tqrcpbegin ..............***Timeout 450.54 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS 
Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.490836e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.200223e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.254132e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.824493e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.461218e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.600768e+00 s Time to initialize coeftab 1.318822e+01 s Test #3255: mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_tqrcpbegin ...***Timeout 450.52 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per 
process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal ischedInit: The thread number has been automatically set to 256 Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.203836e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.151435e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.223013e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.242381e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.633250e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.274061e+00 s Time to initialize coeftab 1.058474e+01 s Test #3257: mpi_dst_example_simple_lap_c_facto0_sched4_not_rqrrtbegin 
...............***Timeout 450.00 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.242648e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.616170e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.210115e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 
Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.629234e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.732282e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 7.300004e+00 s Time to initialize coeftab 1.759782e+01 s Test #3263: mpi_dst_example_simple_lap_c_facto0_sched4_kway_pqrcpilu0 ...............***Timeout 443.65 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time 
to compute ordering 3.316062e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.163911e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.105628e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.796348e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.429226e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 6.995487e+00 s Time to initialize coeftab 2.067169e+00 s Test #3298: mpi_dst_example_simple_lap_c_facto2_sched4_not_svdend ...................***Timeout 442.72 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute 
Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.674843e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.054447e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.614743e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.982758e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.374195e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 7.354342e+00 s Time to initialize coeftab 1.462229e+00 s Test #3307: mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_pqrcpbegin ...***Timeout 438.50 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 
Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.674267e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.073748e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.592069e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.703885e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.047641e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to 
initialize internal csc 1.089467e+01 s Time to initialize coeftab 6.467745e+00 s Test #3309: mpi_dst_example_simple_lap_c_facto2_sched4_not_rqrcpbegin ...............***Timeout 436.50 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.767961e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.523084e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for 
reordering 5.278518e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.273397e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.819748e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.233926e+01 s Time to initialize coeftab 1.957979e+01 s Test #3310: mpi_dst_example_simple_lap_c_facto2_sched4_not_rqrcpend .................***Timeout 435.95 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time ischedInit: The thread number has been automatically set to 256 Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 
300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.228008e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.669429e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.228925e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.296346e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.388940e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.959158e+00 s Time to initialize coeftab 1.767355e+01 s Time to factorize 1.570329e+02 s (260.64 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Test #3311: mpi_dst_example_simple_lap_c_facto2_sched4_kway_rqrcpbegin ..............***Timeout 433.32 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX 
package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.667833e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.394732e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.579113e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.681386e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.174187e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to 
initialize internal csc 8.984840e+00 s Time to initialize coeftab 1.548401e+01 s Test #3312: mpi_dst_example_simple_lap_c_facto2_sched4_kway_rqrcpend ................***Timeout 433.31 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.615606e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.749918e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for 
reordering 1.196968e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.102824e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.923617e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 9.092291e+00 s Time to initialize coeftab 2.655015e+00 s Time to factorize 1.426973e+02 s (286.83 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Test #3313: mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_rqrcpbegin ...***Timeout 433.30 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting 
Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.433016e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.903950e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.028871e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.485749e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.202061e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 8.286635e+00 s Time to initialize coeftab 1.580807e+01 s Test #3315: mpi_dst_example_simple_lap_c_facto2_sched4_not_tqrcpbegin ...............***Timeout 430.67 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per 
process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.755422e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.262813e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.745329e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 3.170030e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.617356e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 4.591342e-01 s Time to initialize coeftab 1.658310e+01 s Test #3316: 
mpi_dst_example_simple_lap_c_facto2_sched4_not_tqrcpend .................***Timeout 427.70 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.257636e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.188790e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.714072e+00 s +-------------------------------------------------+ 
Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.597460e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.956668e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 8.351887e+00 s Time to initialize coeftab 1.171330e+00 s Time to factorize 1.592821e+02 s (256.96 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Test #3317: mpi_dst_example_simple_lap_c_facto2_sched4_kway_tqrcpbegin ..............***Timeout 426.91 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 
16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.538736e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.318189e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.101867e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 9.804518e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.690687e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.763435e-02 s Time to initialize coeftab 1.649460e+01 s Test #3319: mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_tqrcpbegin ...***Timeout 426.02 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI 
processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.656633e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.346026e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.239140e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.782133e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.340500e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 9.906234e+00 s Time to initialize 
coeftab 2.382307e+01 s Test #3320: mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_tqrcpend .....***Timeout 426.00 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.126231e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.679676e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.403286e-01 s 
+-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.853595e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.230004e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.875621e+00 s Time to initialize coeftab 1.553022e+00 s Time to factorize 1.591160e+02 s (257.23 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Test #3321: mpi_dst_example_simple_lap_c_facto2_sched4_not_rqrrtbegin ...............***Timeout 425.99 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress 
min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.303410e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.318749e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.681643e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.323523e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.721782e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.997450e+00 s Time to initialize coeftab 1.433306e+01 s Test #3322: mpi_dst_example_simple_lap_c_facto2_sched4_not_rqrrtend .................***Timeout 425.95 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: 
sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time ischedInit: The thread number has been automatically set to 256 Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.196135e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.659298e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.056685e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 7.446284e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.434783e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 
4.436373e+00 s Time to initialize coeftab 8.292192e-01 s Time to factorize 1.048116e+02 s (390.50 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Test #3323: mpi_dst_example_simple_lap_c_facto2_sched4_kway_rqrrtbegin ..............***Timeout 425.90 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to 
compute ordering 3.529588e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.674336e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.013518e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 7.700138e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.839243e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.976343e+00 s Time to initialize coeftab 1.170015e+01 s Test #3324: mpi_dst_example_simple_lap_c_facto2_sched4_kway_rqrrtend ................***Timeout 425.21 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute 
Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.973982e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.255898e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.095562e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.859701e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.523478e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 9.575529e+00 s Time to initialize coeftab 1.972790e+00 s Time to factorize 1.331698e+02 s (307.35 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Test #3327: mpi_dst_example_simple_lap_c_facto2_sched4_kway_pqrcpilu0 ...............***Timeout 425.20 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.663036e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.018609e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.849564e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.611901e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 
4.772422e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.134866e+01 s Time to initialize coeftab 3.310104e+00 s Test #3328: mpi_dst_example_simple_lap_c_facto2_sched4_kway_pqrcpilu1 ...............***Timeout 425.18 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.190279e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.916324e-02 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.895857e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.824278e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.712499e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 5.529570e+00 s Time to initialize coeftab 1.833824e+00 s Test #3330: mpi_dst_example_simple_lap_c_facto3_sched4_not_svdend ...................***Timeout 425.15 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time ischedInit: The thread number has been automatically set to 256 Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been 
automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.005422e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.627401e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.955720e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.261504e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.979477e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 7.768840e+00 s Time to initialize coeftab 7.625764e+00 s Test #3334: mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_svdend .......***Timeout 425.12 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication 
support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.929371e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.295148e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.880932e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.817280e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.918056e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 8.416238e+00 s Time to initialize coeftab 2.452768e+00 s Test #3335: mpi_dst_example_simple_lap_c_facto3_sched4_not_pqrcpbegin ...............***Timeout 425.10 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread 
number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.987569e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.855552e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.371297e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 
5.477282e-04 s Time for mapping/scheduling 2.002631e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.851966e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 1.074170e+01 s Time to initialize coeftab 6.580764e+00 s Test #3337: mpi_dst_example_simple_lap_c_facto3_sched4_kway_pqrcpbegin ..............***Timeout 423.43 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.249591e+01 s +-------------------------------------------------+ Symbolic factorization subtask: 
Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.282926e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.860501e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.168051e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.451963e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 3.808854e+00 s Time to initialize coeftab 3.264474e+00 s Time to factorize 2.179201e+02 s (95.30 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Test #3339: mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_pqrcpbegin ...***Timeout 421.20 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of 
GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.484256e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.726725e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.535760e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.238433e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.264108e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 1.107824e+01 s Time to initialize coeftab 6.406994e+00 s Test #3340: mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_pqrcpend .....***Timeout 420.49 sec ischedInit: The thread number has been automatically set 
to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.127942e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.335326e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.561853e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model 
AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.818889e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.248413e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 6.859571e+00 s Time to initialize coeftab 1.237987e+00 s Test #3341: mpi_dst_example_simple_lap_c_facto3_sched4_not_rqrcpbegin ...............***Timeout 420.46 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.916887e+01 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.778141e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.347000e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.440419e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.392176e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 3.673573e+00 s Time to initialize coeftab 8.010667e+00 s Test #3344: mpi_dst_example_simple_lap_c_facto3_sched4_kway_rqrcpend ................***Timeout 418.80 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 
Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.168334e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.724995e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.605856e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.632068e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.779133e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 5.437342e+00 s Time to initialize coeftab 2.808940e+00 s Test #3347: mpi_dst_example_simple_lap_c_facto3_sched4_not_tqrcpbegin ...............***Timeout 418.79 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.299924e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.737357e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.695302e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.471273e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.288303e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to 
initialize internal csc 6.478696e+00 s Time to initialize coeftab 9.764324e+00 s Test #3348: mpi_dst_example_simple_lap_c_facto3_sched4_not_tqrcpend .................***Timeout 418.07 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.765127e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.730324e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for 
reordering 8.429822e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.267911e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.373989e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 4.565584e+00 s Time to initialize coeftab 1.279907e+00 s Test #3352: mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_tqrcpend .....***Timeout 416.81 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 
Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.445240e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.041363e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.465059e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.779097e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.579683e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 8.737256e+00 s Time to initialize coeftab 1.600402e+00 s Time to factorize 1.398707e+02 s (148.48 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Test #3353: mpi_dst_example_simple_lap_c_facto3_sched4_not_rqrrtbegin ...............***Timeout 416.81 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : 
Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.564134e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.043403e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.652211e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.885470e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.982774e+01 s +-------------------------------------------------+ Factorization task: 
Factorization used: LL^t Time to initialize internal csc 9.401333e+00 s Time to initialize coeftab 1.469201e+01 s Test #3355: mpi_dst_example_simple_lap_c_facto3_sched4_kway_rqrrtbegin ..............***Timeout 416.79 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.894275e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.946833e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 
Stoping criterion -1 Time for reordering 1.192144e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.895799e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.343554e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 7.697895e+00 s Time to initialize coeftab 1.126163e+01 s Test #3357: mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_rqrrtbegin ...***Timeout 414.90 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 
Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.858343e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.436758e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.990603e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.334151e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.751036e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 6.019524e+00 s Time to initialize coeftab 9.472468e+00 s Test #3359: mpi_dst_example_simple_lap_c_facto3_sched4_kway_pqrcpilu0 ...............***Timeout 413.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 
6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.757320e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.002896e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.186507e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.655882e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.449953e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 1.026672e+01 s Time to initialize coeftab 1.684035e+00 s Test #3360: mpi_dst_example_simple_lap_c_facto3_sched4_kway_pqrcpilu1 ...............***Timeout 413.37 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been 
automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.925801e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.965842e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.237168e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.861562e+01 s 
+-------------------------------------------------+ Analyze task: Total time for analyze 6.628404e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 1.001587e+01 s Time to initialize coeftab 1.811059e+00 s Test #3364: mpi_dst_example_simple_lap_c_facto4_sched4_kway_svdend ..................***Timeout 413.35 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.228523e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in 
L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.236435e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.080479e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.955916e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.297765e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 1.111576e+01 s Time to initialize coeftab 2.201992e+00 s Test #3366: mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_svdend .......***Timeout 413.08 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and 
projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.109840e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.003656e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.110786e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.993482e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.082787e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 1.082521e+01 s Time to initialize coeftab 1.936687e+00 s Test #3370: mpi_dst_example_simple_lap_c_facto4_sched4_kway_pqrcpend ................***Timeout 413.07 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: 
Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.736820e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.714894e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.017313e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.499800e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.225442e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 5.912517e+00 s Time to initialize coeftab 1.570600e+00 s Test #3378: mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_rqrcpend .....***Timeout 
410.84 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.248274e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.624143e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.331298e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 
Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.831955e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.708091e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 1.018436e+01 s Time to initialize coeftab 2.664281e+00 s Test #3382: mpi_dst_example_simple_lap_c_facto4_sched4_kway_tqrcpend ................***Timeout 407.79 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute 
ordering 3.074867e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.740573e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.223263e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.852863e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.059406e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 8.478938e+00 s Time to initialize coeftab 2.005586e+00 s Test #3384: mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_tqrcpend .....***Timeout 407.75 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP 
Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.301195e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.532522e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.477440e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.816500e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.473161e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 6.536699e+00 s Time to initialize coeftab 1.309961e+00 s Test #3391: mpi_dst_example_simple_lap_c_facto4_sched4_kway_pqrcpilu0 ...............***Timeout 407.29 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.073101e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.043418e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.157089e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.045227e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.769131e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to 
initialize internal csc 1.226972e+01 s Time to initialize coeftab 2.544282e+00 s Test #3398: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_svdend .......***Timeout 406.57 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.873054e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.272688e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 
Time for reordering 1.275439e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.019881e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.283712e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 8.412063e+00 s Time to initialize coeftab 2.138221e+00 s Test #3400: mpi_dst_example_simple_lap_z_facto0_sched4_not_pqrcpend .................***Timeout 406.55 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 
Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.654408e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.818914e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.241181e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.430664e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.988978e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 9.121809e+00 s Time to initialize coeftab 1.733815e+00 s Time to factorize 1.082070e+02 s (191.92 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Test #3406: mpi_dst_example_simple_lap_z_facto0_sched4_not_rqrcpend .................***Timeout 403.39 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled 
thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.936744e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.975645e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.464225e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.656659e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.622169e+01 s +-------------------------------------------------+ Factorization task: 
Factorization used: LL^h Time to initialize internal csc 8.788204e+00 s Time to initialize coeftab 1.973002e+00 s Test #3408: mpi_dst_example_simple_lap_z_facto0_sched4_kway_rqrcpend ................***Timeout 400.73 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.012445e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.496409e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 
Stoping criterion -1 Time for reordering 1.687268e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.952740e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.995243e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 8.360794e+00 s Time to initialize coeftab 1.878039e+00 s Test #3410: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_rqrcpend .....***Timeout 399.17 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 
Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.451599e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.967682e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.477550e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.673222e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.635432e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.288830e+01 s Time to initialize coeftab 2.125493e+00 s Test #3443: mpi_dst_example_simple_lap_z_facto1_sched4_not_tqrcpbegin ...............***Timeout 399.11 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking 
size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.587900e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.025591e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.208331e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.346273e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.638914e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 8.381353e+00 s Time to initialize coeftab 1.914178e+01 s Start 3443: mpi_dst_example_simple_lap_z_facto1_sched4_not_tqrcpbegin Test #3444: mpi_dst_example_simple_lap_z_facto1_sched4_not_tqrcpend .................***Timeout 399.09 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number 
has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.651850e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.684458e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.608003e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s 
Time for mapping/scheduling 1.521458e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.315201e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.179605e+00 s Time to initialize coeftab 1.549178e+00 s Time to factorize 1.421070e+02 s (153.54 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Start 3444: mpi_dst_example_simple_lap_z_facto1_sched4_not_tqrcpend Test #3445: mpi_dst_example_simple_lap_z_facto1_sched4_kway_tqrcpbegin ..............***Timeout 399.32 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels 
of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.058201e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.340580e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.927614e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.931383e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.052395e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.037127e+01 s Time to initialize coeftab 1.751855e+01 s Start 3445: mpi_dst_example_simple_lap_z_facto1_sched4_kway_tqrcpbegin Test #3446: mpi_dst_example_simple_lap_z_facto1_sched4_kway_tqrcpend ................***Timeout 400.15 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started 
thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.307541e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.867342e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.226699e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.787035e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.292319e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.651705e+00 s Time to initialize coeftab 1.149128e+00 s Time to factorize 1.677606e+02 s (130.06 KFlop/s) 
Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Start 3446: mpi_dst_example_simple_lap_z_facto1_sched4_kway_tqrcpend Test #3447: mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_tqrcpbegin ...***Timeout 400.15 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.557274e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization 
using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.181204e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.770847e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.620278e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.199141e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.501528e+00 s Time to initialize coeftab 1.531258e+01 s Start 3447: mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_tqrcpbegin Test #3449: mpi_dst_example_simple_lap_z_facto1_sched4_not_rqrrtbegin ...............***Timeout 398.09 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block 
Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.798377e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.272981e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.024491e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.275880e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.557270e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 8.411571e+00 s Time to initialize coeftab 1.430302e+01 s Start 3449: mpi_dst_example_simple_lap_z_facto1_sched4_not_rqrrtbegin Test #3451: mpi_dst_example_simple_lap_z_facto1_sched4_kway_rqrrtbegin ..............***Timeout 394.90 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: 
Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.184315e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.104293e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.634954e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.055379e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.216209e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t 
Time to initialize internal csc 1.560660e+01 s Time to initialize coeftab 2.153000e+01 s Start 3451: mpi_dst_example_simple_lap_z_facto1_sched4_kway_rqrrtbegin Test #3452: mpi_dst_example_simple_lap_z_facto1_sched4_kway_rqrrtend ................***Timeout 393.93 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.693283e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.410383e-02 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.321375e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.980862e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.439438e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 8.498991e+00 s Time to initialize coeftab 3.071223e+00 s Start 3452: mpi_dst_example_simple_lap_z_facto1_sched4_kway_rqrrtend Test #3453: mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_rqrrtbegin ...***Timeout 393.93 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections 
Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.869901e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.123688e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.552587e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.008008e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.381324e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 5.174404e+00 s Time to initialize coeftab 1.123271e+01 s Start 3453: mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_rqrrtbegin Test #3454: mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_rqrrtend .....***Timeout 393.10 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: 
PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL ischedInit: The thread number has been automatically set to 256 GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.173198e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.868228e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.277368e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.614882e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.977076e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.153785e+01 s Time to initialize coeftab 1.638185e+00 s Start 
3454: mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_rqrrtend Test #3455: mpi_dst_example_simple_lap_z_facto1_sched4_kway_pqrcpilu0 ...............***Timeout 393.09 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 1: 300 1140 0: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.977914e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.304598e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 
2.154065e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 2.028295e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.379468e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 9.997009e+00 s Time to initialize coeftab 2.860140e+00 s Start 3455: mpi_dst_example_simple_lap_z_facto1_sched4_kway_pqrcpilu0 Test #3456: mpi_dst_example_simple_lap_z_facto1_sched4_kway_pqrcpilu1 ...............***Timeout 392.67 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian 
Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.019572e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.539067e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.460296e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.843581e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.029084e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 8.986195e+00 s Time to initialize coeftab 1.532194e+00 s Start 3456: mpi_dst_example_simple_lap_z_facto1_sched4_kway_pqrcpilu1 Test #3457: mpi_dst_example_simple_lap_z_facto2_sched4_not_svdbegin .................***Timeout 391.12 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of 
GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.235982e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.315098e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.683997e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.212411e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.304475e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 8.511064e+00 s Time to initialize coeftab 1.874008e+01 s Start 3457: mpi_dst_example_simple_lap_z_facto2_sched4_not_svdbegin Test #3458: mpi_dst_example_simple_lap_z_facto2_sched4_not_svdend ...................***Timeout 391.13 sec 
ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.885454e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.180851e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.304417e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in 
full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 8.419026e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.156102e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.124876e-01 s Time to initialize coeftab 1.926123e+00 s Start 3458: mpi_dst_example_simple_lap_z_facto2_sched4_not_svdend Test #3459: mpi_dst_example_simple_lap_z_facto2_sched4_kway_svdbegin ................***Timeout 391.13 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering 
method is: Scotch Time to compute ordering 2.855648e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.895617e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.277831e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.763478e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.399027e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.097798e+01 s Time to initialize coeftab 1.831765e+01 s Start 3459: mpi_dst_example_simple_lap_z_facto2_sched4_kway_svdbegin Test #3460: mpi_dst_example_simple_lap_z_facto2_sched4_kway_svdend ..................***Timeout 391.14 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L 
- CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.537555e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.624284e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.305494e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.917715e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.984388e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 7.099516e+00 s Time to initialize coeftab 3.317188e+00 s Time to factorize 1.310715e+02 s (312.27 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 355 Ko / 355 Ko ------------------------------------------------ 
Total 451 Ko / 451 Ko Start 3460: mpi_dst_example_simple_lap_z_facto2_sched4_kway_svdend Test #3461: mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_svdbegin .....***Timeout 391.15 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.038670e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.500424e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping 
criterion -1 Time for reordering 1.602499e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 9.803405e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.748237e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.356441e+01 s Time to initialize coeftab 2.018224e+01 s Start 3461: mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_svdbegin Test #3462: mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_svdend .......***Timeout 391.15 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 
Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.731885e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.307630e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.427762e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.773740e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.174554e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.014818e+01 s Time to initialize coeftab 2.422656e+00 s Start 3462: mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_svdend Test #3463: mpi_dst_example_simple_lap_z_facto2_sched4_not_pqrcpbegin ...............***Timeout 391.15 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI 
communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.187898e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.357252e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.296774e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.862998e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.279904e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 9.881524e+00 s Time to initialize coeftab 2.067084e+01 s Start 3463: mpi_dst_example_simple_lap_z_facto2_sched4_not_pqrcpbegin Test #3465: 
mpi_dst_example_simple_lap_z_facto2_sched4_kway_pqrcpbegin ..............***Timeout 391.14 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.625637e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.054741e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.497819e+00 s +-------------------------------------------------+ Mapping/Scheduling 
subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.673725e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.972738e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.010724e+01 s Time to initialize coeftab 9.268978e+00 s Start 3465: mpi_dst_example_simple_lap_z_facto2_sched4_kway_pqrcpbegin Test #3466: mpi_dst_example_simple_lap_z_facto2_sched4_kway_pqrcpend ................***Timeout 391.14 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 
760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.928512e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.109312e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.035962e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.651750e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.401251e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.239599e+01 s Time to initialize coeftab 2.605751e+00 s Time to factorize 9.295791e+01 s (440.30 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 355 Ko / 355 Ko Start 3466: mpi_dst_example_simple_lap_z_facto2_sched4_kway_pqrcpend Test #3467: mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_pqrcpbegin ...***Timeout 391.14 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: 
sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.047870e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.546618e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.060113e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.725143e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.507134e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize 
internal csc 1.038100e+01 s Time to initialize coeftab 8.467000e+00 s Start 3467: mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_pqrcpbegin Test #3468: mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_pqrcpend .....***Timeout 391.15 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.725268e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.092252e-01 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.710959e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.150348e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.115155e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.440506e+01 s Time to initialize coeftab 2.069766e+00 s Time to factorize 9.562205e+01 s (428.03 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 355 Ko / 355 Ko ------------------------------------------------ Total 451 Ko / 451 Ko Start 3468: mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_pqrcpend Test #3469: mpi_dst_example_simple_lap_z_facto2_sched4_not_rqrcpbegin ...............***Timeout 391.09 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution 
level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.886913e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.027026e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.392997e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.068891e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.972619e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.618145e+00 s Time to initialize coeftab 1.454408e+01 s Start 3469: mpi_dst_example_simple_lap_z_facto2_sched4_not_rqrcpbegin Test #3471: mpi_dst_example_simple_lap_z_facto2_sched4_kway_rqrcpbegin ..............***Timeout 391.07 sec ischedInit: The thread number has been automatically set to 256 ischedInit: 
The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.951088e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.963025e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.149767e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 
1.548061e-03 s Time for mapping/scheduling 1.547960e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.634549e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 8.577036e-01 s Time to initialize coeftab 1.660969e+01 s Start 3471: mpi_dst_example_simple_lap_z_facto2_sched4_kway_rqrcpbegin Test #3472: mpi_dst_example_simple_lap_z_facto2_sched4_kway_rqrcpend ................***Timeout 391.11 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.494370e+01 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.069813e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.269771e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.609012e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.090852e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 8.191925e+00 s Time to initialize coeftab 2.354661e+00 s Time to factorize 1.437669e+02 s (284.69 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 355 Ko / 355 Ko ------------------------------------------------ Total 451 Ko / 451 Ko Start 3472: mpi_dst_example_simple_lap_z_facto2_sched4_kway_rqrcpend Test #3473: mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_rqrcpbegin ...***Timeout 391.16 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: 
PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.829671e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.641342e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.810036e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.022933e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.098561e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.253205e+01 s Time to initialize coeftab 2.278400e+01 s Start 3473: 
mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_rqrcpbegin Test #3474: mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_rqrcpend .....***Timeout 391.21 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.063057e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.241854e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for 
reordering 9.387190e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.439728e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.530619e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 8.149593e+00 s Time to initialize coeftab 2.644701e+00 s Time to factorize 1.278621e+02 s (320.11 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 355 Ko / 355 Ko ------------------------------------------------ Total 451 Ko / 451 Ko Start 3474: mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_rqrcpend Test #3475: mpi_dst_example_simple_lap_z_facto2_sched4_not_tqrcpbegin ...............***Timeout 391.27 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal 
width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.459315e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.120805e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.531788e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 2.259709e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.970686e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 7.228978e+00 s Time to initialize coeftab 2.539601e+01 s Start 3475: mpi_dst_example_simple_lap_z_facto2_sched4_not_tqrcpbegin Test #3476: mpi_dst_example_simple_lap_z_facto2_sched4_not_tqrcpend .................***Timeout 391.33 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 
256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time ischedInit: The thread number has been automatically set to 256 Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.454400e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.122918e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.292577e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.752527e+01 s +-------------------------------------------------+ 
Analyze task: Total time for analyze 4.894552e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 7.868989e+00 s Time to initialize coeftab 1.874935e+00 s Time to factorize 1.367401e+02 s (299.32 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Start 3476: mpi_dst_example_simple_lap_z_facto2_sched4_not_tqrcpend Test #3477: mpi_dst_example_simple_lap_z_facto2_sched4_kway_tqrcpbegin ..............***Timeout 391.40 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 
2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.047809e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.313759e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.397464e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.708982e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.887471e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.073024e-01 s Time to initialize coeftab 1.322143e+01 s Start 3477: mpi_dst_example_simple_lap_z_facto2_sched4_kway_tqrcpbegin Test #3478: mpi_dst_example_simple_lap_z_facto2_sched4_kway_tqrcpend ................***Timeout 391.47 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking 
size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.026620e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.070355e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.948742e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 5.415822e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.073798e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.471102e+00 s Time to initialize coeftab 1.887214e+00 s Start 3478: mpi_dst_example_simple_lap_z_facto2_sched4_kway_tqrcpend Test #3479: mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_tqrcpbegin ...***Timeout 391.47 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been 
automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.991340e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.883291e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.309418e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 
s Time for mapping/scheduling 1.542095e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.641430e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 5.059207e+00 s Time to initialize coeftab 1.879322e+01 s Start 3479: mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_tqrcpbegin Test #3480: mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_tqrcpend .....***Timeout 391.39 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.774897e+01 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.102056e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.270448e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.710204e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.104622e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.188838e+01 s Time to initialize coeftab 2.479019e+00 s Start 3480: mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_tqrcpend Test #3481: mpi_dst_example_simple_lap_z_facto2_sched4_not_rqrrtbegin ...............***Timeout 391.93 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 
Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.583961e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.123638e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.636918e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.358163e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.415669e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 8.173095e+00 s Time to initialize coeftab 2.253025e+01 s Start 3481: mpi_dst_example_simple_lap_z_facto2_sched4_not_rqrrtbegin Test #3482: mpi_dst_example_simple_lap_z_facto2_sched4_not_rqrrtend .................***Timeout 393.73 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 
256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.762764e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.646307e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.788830e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.450022e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.386987e+01 s 
+-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.467260e+00 s Time to initialize coeftab 6.659054e+00 s Time to factorize 1.020700e+02 s (400.99 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Start 3482: mpi_dst_example_simple_lap_z_facto2_sched4_not_rqrrtend Test #3483: mpi_dst_example_simple_lap_z_facto2_sched4_kway_rqrrtbegin ..............***Timeout 393.73 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 
+-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.901872e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.626166e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.910156e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.805685e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.007408e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.687228e+00 s Time to initialize coeftab 1.364679e+01 s Start 3483: mpi_dst_example_simple_lap_z_facto2_sched4_kway_rqrrtbegin Test #3484: mpi_dst_example_simple_lap_z_facto2_sched4_kway_rqrrtend ................***Timeout 393.15 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 
320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.034650e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.671484e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.123510e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.995187e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 8.458729e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 8.587923e+00 s Time to initialize coeftab 2.836147e+00 s Start 3484: mpi_dst_example_simple_lap_z_facto2_sched4_kway_rqrrtend Test #3485: mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_rqrrtbegin ...***Timeout 390.97 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.581932e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.569333e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.475156e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for 
mapping/scheduling 2.159092e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.642440e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 8.164587e+00 s Time to initialize coeftab 1.505657e+01 s Start 3485: mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_rqrrtbegin Test #3486: mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_rqrrtend .....***Timeout 388.93 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.529177e+01 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.134529e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.688175e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.584281e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.313348e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.913378e+00 s Time to initialize coeftab 1.801936e+00 s Time to factorize 1.231180e+02 s (332.44 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Start 3486: mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_rqrrtend Test #3487: mpi_dst_example_simple_lap_z_facto2_sched4_kway_pqrcpilu0 ...............***Timeout 387.12 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational 
models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.600650e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.011137e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.010888e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.809158e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.535647e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.122321e+01 s Time to initialize coeftab 2.089970e+00 s Start 3487: mpi_dst_example_simple_lap_z_facto2_sched4_kway_pqrcpilu0 Test #3488: mpi_dst_example_simple_lap_z_facto2_sched4_kway_pqrcpilu1 ...............***Timeout 387.15 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The 
thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Start 3488: mpi_dst_example_simple_lap_z_facto2_sched4_kway_pqrcpilu1 Test #3489: mpi_dst_example_simple_lap_z_facto3_sched4_not_svdbegin .................***Timeout 387.18 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: 
Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.036573e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.201071e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.162955e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.182981e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.778244e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 1.232301e+01 s Time to initialize coeftab 1.460862e+01 s Start 3489: mpi_dst_example_simple_lap_z_facto3_sched4_not_svdbegin Test #3490: mpi_dst_example_simple_lap_z_facto3_sched4_not_svdend ...................***Timeout 387.20 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been 
automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.464271e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.596247e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.609064e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.174114e+01 s 
+-------------------------------------------------+ Analyze task: Total time for analyze 6.596508e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 1.238207e+01 s Time to initialize coeftab 2.209992e+00 s Start 3490: mpi_dst_example_simple_lap_z_facto3_sched4_not_svdend Test #3492: mpi_dst_example_simple_lap_z_facto3_sched4_kway_svdend ..................***Timeout 387.17 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 9.656019e+01 s +-------------------------------------------------+ Symbolic factorization 
subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.098586e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.870586e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.413773e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.319203e+02 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 9.197043e+00 s Time to initialize coeftab 1.502274e+00 s Start 3492: mpi_dst_example_simple_lap_z_facto3_sched4_kway_svdend Test #3493: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_svdbegin .....***Timeout 387.14 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per 
block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.346850e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.748987e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.779123e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.941516e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.941653e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 6.811420e+00 s Time to initialize coeftab 1.475413e+01 s Start 3493: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_svdbegin Test #3494: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_svdend .......***Timeout 387.11 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: 
Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.047260e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.806700e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.281218e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.291562e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 9.933245e+01 s +-------------------------------------------------+ Factorization 
task: Factorization used: LL^t Time to initialize internal csc 1.010099e+01 s Time to initialize coeftab 1.983303e+00 s Start 3494: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_svdend Test #3495: mpi_dst_example_simple_lap_z_facto3_sched4_not_pqrcpbegin ...............***Timeout 387.07 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 2: 200 760 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.094492e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.872111e-02 
s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.051827e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.221499e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.545145e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 8.947119e+00 s Time to initialize coeftab 8.389552e+00 s Start 3495: mpi_dst_example_simple_lap_z_facto3_sched4_not_pqrcpbegin Test #3496: mpi_dst_example_simple_lap_z_facto3_sched4_not_pqrcpend .................***Timeout 387.03 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of 
projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.593592e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.942937e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.003107e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.137717e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.505384e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 9.469130e+00 s Time to initialize coeftab 2.070120e+00 s Time to factorize 7.273828e+01 s (285.51 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Start 3496: mpi_dst_example_simple_lap_z_facto3_sched4_not_pqrcpend Test #3497: mpi_dst_example_simple_lap_z_facto3_sched4_kway_pqrcpbegin ..............***Timeout 388.03 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.696974e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.644103e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.625644e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.594648e+01 s +-------------------------------------------------+ Analyze 
task: Total time for analyze 4.441374e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 9.581318e+00 s Time to initialize coeftab 6.417302e+00 s Start 3497: mpi_dst_example_simple_lap_z_facto3_sched4_kway_pqrcpbegin Test #3499: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_pqrcpbegin ...***Timeout 387.95 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.061502e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax 
Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.558561e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.019213e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.984070e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.972454e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 1.464523e+01 s Time to initialize coeftab 7.619028e+00 s Start 3499: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_pqrcpbegin Test #3500: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_pqrcpend .....***Timeout 387.61 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min 
ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.799745e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.157728e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.554834e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.870456e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.296318e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 9.548380e+00 s Time to initialize coeftab 2.709828e+00 s Start 3500: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_pqrcpend Test #3501: mpi_dst_example_simple_lap_z_facto3_sched4_not_rqrcpbegin ...............***Timeout 386.50 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse 
matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.431234e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.777378e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.618819e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.807187e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.289077e+01 s +-------------------------------------------------+ Factorization task: Factorization used: 
LL^t Time to initialize internal csc 8.166329e+00 s Time to initialize coeftab 1.499428e+01 s Start 3501: mpi_dst_example_simple_lap_z_facto3_sched4_not_rqrcpbegin Test #3502: mpi_dst_example_simple_lap_z_facto3_sched4_not_rqrcpend .................***Timeout 385.91 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.958232e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.370802e+00 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.053577e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.099841e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.786548e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 9.932216e+00 s Time to initialize coeftab 1.724122e+00 s Time to factorize 9.148818e+01 s (227.00 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Start 3502: mpi_dst_example_simple_lap_z_facto3_sched4_not_rqrcpend Test #3503: mpi_dst_example_simple_lap_z_facto3_sched4_kway_rqrcpbegin ..............***Timeout 385.90 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 
Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.387396e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.488222e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.391894e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.873978e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.438097e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 8.384919e+00 s Time to initialize coeftab 1.659297e+01 s Start 3503: mpi_dst_example_simple_lap_z_facto3_sched4_kway_rqrcpbegin Test #3504: mpi_dst_example_simple_lap_z_facto3_sched4_kway_rqrcpend ................***Timeout 386.08 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: 
Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.564745e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.109989e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.132075e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.533029e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.411955e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 1.114517e+01 s 
Time to initialize coeftab 2.223774e+00 s Start 3504: mpi_dst_example_simple_lap_z_facto3_sched4_kway_rqrcpend Test #3505: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_rqrcpbegin ...***Timeout 386.55 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.223539e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.063063e-01 s +-------------------------------------------------+ Reordering subtask: 
Split level 0 Stoping criterion -1 Time for reordering 8.656442e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.430717e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.242992e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 1.431502e+01 s Time to initialize coeftab 1.596104e+01 s Start 3505: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_rqrcpbegin Test #3506: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_rqrcpend .....***Timeout 387.48 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections 
distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.845424e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.353940e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.745843e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.126536e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.694832e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 8.418149e+00 s Time to initialize coeftab 2.647063e+00 s Start 3506: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_rqrcpend Test #3507: mpi_dst_example_simple_lap_z_facto3_sched4_not_tqrcpbegin ...............***Timeout 387.47 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: 
Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.795741e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.211051e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.242663e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.991435e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.887223e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 1.125320e+01 s Time to initialize coeftab 2.055876e+01 s Start 3507: mpi_dst_example_simple_lap_z_facto3_sched4_not_tqrcpbegin Test #3508: 
mpi_dst_example_simple_lap_z_facto3_sched4_not_tqrcpend .................***Timeout 387.47 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time ischedInit: The thread number has been automatically set to 256 Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.244909e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.564819e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.541518e-01 s +-------------------------------------------------+ 
Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.507848e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.720981e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 9.378150e+00 s Time to initialize coeftab 1.685675e+00 s Time to factorize 1.069376e+02 s (194.20 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Start 3508: mpi_dst_example_simple_lap_z_facto3_sched4_not_tqrcpend Test #3509: mpi_dst_example_simple_lap_z_facto3_sched4_kway_tqrcpbegin ..............***Timeout 386.44 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 
Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.133236e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.210184e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.407697e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.974735e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.808577e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 7.047600e+00 s Time to initialize coeftab 2.165704e+01 s Start 3509: mpi_dst_example_simple_lap_z_facto3_sched4_kway_tqrcpbegin Test #3258: mpi_dst_example_simple_lap_c_facto0_sched4_not_rqrrtend .................***Timeout 304.60 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.746410e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.004995e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.520533e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.495444e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.642651e+01 s 
+-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 7.710105e+00 s Time to initialize coeftab 1.646991e+00 s Test #3284: mpi_dst_example_simple_lap_c_facto1_sched4_not_tqrcpend .................***Timeout 275.91 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.992388e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.409733e-02 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.786465e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.986531e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.051078e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 6.088222e+00 s Time to initialize coeftab 1.625554e+00 s Test #3432: mpi_dst_example_simple_lap_z_facto1_sched4_not_pqrcpend .................***Timeout 206.10 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections 
depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.700277e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.767958e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.003681e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.031674e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.140596e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.894621e+00 s Time to initialize coeftab 8.008039e-01 s Test #3434: mpi_dst_example_simple_lap_z_facto1_sched4_kway_pqrcpend ................***Timeout 205.93 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI 
communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.640178e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.593680e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.554599e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.542290e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.440591e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.053406e+00 s Time to initialize coeftab 8.535071e-01 s Test #3438: mpi_dst_example_simple_lap_z_facto1_sched4_not_rqrcpend .................***Timeout 204.17 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread 
number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.466378e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.523194e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.476756e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Test #3578: bcsc_mpi_rep_test_bcsc_spmv_tests_hb ....................................***Timeout 160.55 sec ischedInit: The thread number has been automatically set to 256 
ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3578: bcsc_mpi_rep_test_bcsc_spmv_tests_hb Test #3588: bcsc_mpi_rep_test_bvec_gemv_tests .......................................***Timeout 158.58 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Case Float - Sequential: SUCCESS Case Float - Sequential: SUCCESS Case Double - Sequential: SUCCESS Case Double - Sequential: SUCCESS Case Complex32 - Sequential: SUCCESS Case Complex32 - Sequential: SUCCESS Case Float - Sequential: SUCCESS Case Complex64 - Sequential: SUCCESS Case Complex64 - Sequential: SUCCESS Case Double - Sequential: SUCCESS Case Complex32 - Sequential: SUCCESS Case Complex64 - Sequential: SUCCESS Case Float - Static: SUCCESS Case Float - Static: SUCCESS Case Float - Static: SUCCESS Case Double - Static: SUCCESS Case Double - Static: SUCCESS Case Double - Static: SUCCESS Case Complex32 - Static: SUCCESS Case Complex32 - Static: SUCCESS Case Complex64 - Static: 
SUCCESS -- All tests PASSED -- Case Complex64 - Static: SUCCESS -- All tests PASSED -- Start 3588: bcsc_mpi_rep_test_bvec_gemv_tests Test #3621: python_shm_simple .......................................................***Timeout 149.51 sec Start 3621: python_shm_simple Test #3625: python_shm_simple_obj ...................................................***Timeout 148.19 sec Start 3625: python_shm_simple_obj Test #3343: mpi_dst_example_simple_lap_c_facto3_sched4_kway_rqrcpbegin ..............***Failed 145.39 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP ischedInit: The thread number has been automatically set to 256 Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.441318e+01 s 
[arch-nspawn-3655178:977173] *** Process received signal ***
[arch-nspawn-3655178:977173] Signal: Segmentation fault (11)
[arch-nspawn-3655178:977173] Signal code: Address not mapped (1)
[arch-nspawn-3655178:977173] Failing at address: 0x7feb001a1960
[arch-nspawn-3655178:977173] [ 0] linux-vdso.so.1(__vdso_rt_sigreturn+0x0) [0x7f1c4f6d46cc]
[arch-nspawn-3655178:977173] [ 1] /usr/lib/libopen-pal.so.80(mca_btl_sm_poll_handle_frag+0x18a) [0x7f1c454f3a02]
[arch-nspawn-3655178:977173] [ 2] /usr/lib/libopen-pal.so.80(+0x74504) [0x7f1c454f4504]
[arch-nspawn-3655178:977173] [ 3] /usr/lib/libopen-pal.so.80(opal_progress+0x30) [0x7f1c454a5a7a]
[arch-nspawn-3655178:977173] [ 4] /usr/lib/libopen-pal.so.80(ompi_sync_wait_mt+0xda) [0x7f1c454d2aa2]
[arch-nspawn-3655178:977173] [ 5] /usr/lib/libmpi.so.40(ompi_request_default_wait_all+0x126) [0x7f1c45c80840]
[arch-nspawn-3655178:977173] [ 6] /usr/lib/libmpi.so.40(ompi_coll_base_bcast_intra_generic+0x404) [0x7f1c45ceb822]
[arch-nspawn-3655178:977173] [ 7] /usr/lib/libmpi.so.40(ompi_coll_tuned_bcast_intra_dec_fixed+0x34) [0x7f1c45d1639c]
[arch-nspawn-3655178:977173] [ 8] /usr/lib/libmpi.so.40(PMPI_Bcast+0xfc) [0x7f1c45c9015c]
[arch-nspawn-3655178:977173] [ 9] /build/pastix/src/build/libpastix.so.6.4(pastixOrderBcast+0x9e) [0x7f1c4e97a45a]
[arch-nspawn-3655178:977173] [10] /build/pastix/src/build/libpastix.so.6.4(pastix_subtask_order+0x42c) [0x7f1c4e984496]
[arch-nspawn-3655178:977173] [11] /build/pastix/src/build/libpastix.so.6.4(pastix_task_analyze+0x3a) [0x7f1c4e98eb34]
[arch-nspawn-3655178:977173] [12] ./simple(+0xe8e) [0x555555556e8e]
[arch-nspawn-3655178:977173] [13] /usr/lib/libc.so.6(+0x27fae) [0x7f1c45aa4fae]
[arch-nspawn-3655178:977173] [14] /usr/lib/libc.so.6(__libc_start_main+0x72) [0x7f1c45aa50b8]
[arch-nspawn-3655178:977173] [15] ./simple(+0x1174) [0x555555557174]
[arch-nspawn-3655178:977173] *** End of error message ***
[arch-nspawn-3655178:977327] *** Process received signal ***
[arch-nspawn-3655178:977327] Signal: Segmentation fault (11)
[arch-nspawn-3655178:977327] Signal code: Address not mapped (1)
[arch-nspawn-3655178:977327] Failing at address: 0x7fa82cca26e0
[arch-nspawn-3655178:977327] [ 0] linux-vdso.so.1(__vdso_rt_sigreturn+0x0) [0x7ff7324c46cc]
[arch-nspawn-3655178:977327] [ 1] /usr/lib/libopen-pal.so.80(mca_btl_sm_poll_handle_frag+0x18a) [0x7ff7282f3a02]
[arch-nspawn-3655178:977327] [ 2] /usr/lib/libopen-pal.so.80(+0x74504) [0x7ff7282f4504]
[arch-nspawn-3655178:977327] [ 3] /usr/lib/libopen-pal.so.80(opal_progress+0x30) [0x7ff7282a5a7a]
[arch-nspawn-3655178:977327] [ 4] /usr/lib/libopen-pal.so.80(ompi_sync_wait_mt+0xda) [0x7ff7282d2aa2]
[arch-nspawn-3655178:977327] [ 5] /usr/lib/libmpi.so.40(ompi_request_default_wait_all+0x126) [0x7ff728a80840]
[arch-nspawn-3655178:977327] [ 6] /usr/lib/libmpi.so.40(ompi_coll_base_bcast_intra_generic+0x572) [0x7ff728aeb990]
[arch-nspawn-3655178:977327] [ 7] /usr/lib/libmpi.so.40(ompi_coll_tuned_bcast_intra_dec_fixed+0x34) [0x7ff728b1639c]
[arch-nspawn-3655178:977327] [ 8] /usr/lib/libmpi.so.40(PMPI_Bcast+0xfc) [0x7ff728a9015c]
[arch-nspawn-3655178:977327] [ 9] /build/pastix/src/build/libpastix.so.6.4(pastixOrderBcast+0x9e) [0x7ff7316d145a]
[arch-nspawn-3655178:977327] [10] /build/pastix/src/build/libpastix.so.6.4(pastix_subtask_order+0x42c) [0x7ff7316db496]
[arch-nspawn-3655178:977327] [11] /build/pastix/src/build/libpastix.so.6.4(pastix_task_analyze+0x3a) [0x7ff7316e5b34]
[arch-nspawn-3655178:977327] [12] ./simple(+0xe8e) [0x555555556e8e]
[arch-nspawn-3655178:977327] [13] /usr/lib/libc.so.6(+0x27fae) [0x7ff7288a4fae]
[arch-nspawn-3655178:977327] [14] /usr/lib/libc.so.6(__libc_start_main+0x72) [0x7ff7288a50b8]
[arch-nspawn-3655178:977327] [15] ./simple(+0x1174) [0x555555557174]
[arch-nspawn-3655178:977327] *** End of error message ***
--------------------------------------------------------------------------
prte noticed that process rank 0 with PID 977173 on node arch-nspawn-3655178 exited on signal 11 (Segmentation fault).
-------------------------------------------------------------------------- Test #3565: bcsc_shm_test_bcsc_spmv_time_rsa ........................................ Passed 125.64 sec Test #3570: bcsc_shm_test_bvec_tests ................................................ Passed 136.24 sec Test #3556: bcsc_shm_test_bcsc_spmv_tests_lap_z ..................................... Passed 44.50 sec Test #3555: bcsc_shm_test_bcsc_spmv_tests_lap_c ..................................... Passed 45.93 sec Test #3560: bcsc_shm_test_bcsc_spmv_tests_mm2 ....................................... Passed 44.54 sec Test #3571: bcsc_shm_test_bvec_applyorder_tests ..................................... Passed 49.10 sec Test #3553: bcsc_shm_test_bcsc_spmv_tests_lap_s ..................................... Passed 65.90 sec Test #3568: bcsc_shm_test_bcsc_spmv_time_mm2 ........................................ Passed 63.91 sec Test #3259: mpi_dst_example_simple_lap_c_facto0_sched4_kway_rqrrtbegin ..............***Timeout 295.63 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 
2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.664514e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.881561e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.350706e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 6.251337e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.740383e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.423991e+00 s Time to initialize coeftab 6.903632e+00 s Test #3261: mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_rqrrtbegin ...***Timeout 295.63 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per 
process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.070787e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.978446e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.066808e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 6.755731e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.562753e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.861012e+00 s Time to initialize coeftab 5.786701e+00 s Test #3262: 
mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_rqrrtend .....***Timeout 295.62 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.616022e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.921793e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.531654e-01 s +-------------------------------------------------+ 
Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 9.430920e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.115539e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.860091e+00 s Time to initialize coeftab 1.253635e+00 s Test #3264: mpi_dst_example_simple_lap_c_facto0_sched4_kway_pqrcpilu1 ...............***Timeout 295.61 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 
+-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.356342e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.758808e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.897951e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 6.157287e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.602358e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 9.281431e-01 s Time to initialize coeftab 4.827215e-01 s Test #3265: mpi_dst_example_simple_lap_c_facto1_sched4_not_svdbegin .................***Timeout 295.61 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: 
Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.308820e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.946176e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.944709e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.675342e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.081679e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.164903e+00 s Time to initialize coeftab 5.237130e+00 s Test #3266: mpi_dst_example_simple_lap_c_facto1_sched4_not_svdend ...................***Timeout 295.60 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.465326e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.000253e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.819115e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.571934e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 
2.596124e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.990216e+00 s Time to initialize coeftab 3.871693e-01 s Test #3267: mpi_dst_example_simple_lap_c_facto1_sched4_kway_svdbegin ................***Timeout 295.58 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.638722e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.937304e-03 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.908779e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.275863e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.670524e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.917259e+00 s Time to initialize coeftab 4.995983e+00 s Test #3268: mpi_dst_example_simple_lap_c_facto1_sched4_kway_svdend ..................***Timeout 295.57 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been 
automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.848097e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.823211e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.867926e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.140942e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.185961e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.882993e+00 s Time to initialize coeftab 7.766733e-01 s Test #3269: mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_svdbegin .....***Timeout 295.57 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD 
Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.165272e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.801775e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.655066e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.373038e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.218866e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.334283e-01 s Time to initialize coeftab 5.768817e+00 s Test #3270: mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_svdend .......***Timeout 295.56 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The 
thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch 1: 300 1140 2: 200 760 3: 200 660 Time to compute ordering 1.803483e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.453951e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.431159e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to 
factorize 6.932254e-04 s Time for mapping/scheduling 7.618072e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.869609e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.317764e+00 s Time to initialize coeftab 6.070448e-01 s Test #3271: mpi_dst_example_simple_lap_c_facto1_sched4_not_pqrcpbegin ...............***Timeout 295.55 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.466161e+01 s +-------------------------------------------------+ Symbolic 
factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.680608e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.777236e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.901605e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.310379e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.574251e+00 s Time to initialize coeftab 3.175995e+00 s Test #3273: mpi_dst_example_simple_lap_c_facto1_sched4_kway_pqrcpbegin ..............***Timeout 295.55 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS 
Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.970018e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.509440e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.060872e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.494607e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.928440e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.596434e+00 s Time to initialize coeftab 2.097427e+00 s Test #3275: mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_pqrcpbegin ...***Timeout 295.54 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled 
thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.186817e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.085199e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.560770e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.257944e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.692758e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.547297e+00 s Time to initialize coeftab 2.051067e+00 s Test 
#3276: mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_pqrcpend .....***Timeout 295.53 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.412321e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.819396e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.970140e-01 s 
+-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.482772e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.122601e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.759607e+00 s Time to initialize coeftab 7.040381e-01 s Time to factorize 3.891251e+01 s (560.73 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 7.623278e+00 s Time for refinement 9.105576e+00 s || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.130383e-07 max(|| b_i - A x_i ||_1) 1.096060e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.765741e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.130383e-07 max(|| b_i - A x_i ||_1) 1.096060e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.765741e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.130383e-07 max(|| b_i - A x_i ||_1) 1.096060e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.765741e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 7.130383e-07 max(|| b_i - A x_i ||_1) 
1.096060e-07 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.765741e+00 (SUCCESS) Test #3277: mpi_dst_example_simple_lap_c_facto1_sched4_not_rqrcpbegin ...............***Timeout 295.52 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.521119e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.729638e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion 
-1 Time for reordering 5.420793e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.461958e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.434560e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.775850e-01 s Time to initialize coeftab 6.107887e+00 s Test #3278: mpi_dst_example_simple_lap_c_facto1_sched4_not_rqrcpend .................***Timeout 295.52 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 
Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.895468e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.194989e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.055563e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.089202e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.877570e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.486263e+00 s Time to initialize coeftab 8.096697e-01 s Time to factorize 7.374597e+01 s (295.87 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 1.974374e+01 s Time for refinement 7.731877e+00 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 Test #3279: 
mpi_dst_example_simple_lap_c_facto1_sched4_kway_rqrcpbegin ..............***Timeout 295.51 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.469680e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.597197e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.556396e-01 s +-------------------------------------------------+ Mapping/Scheduling 
subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.483363e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.586593e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.772979e+00 s Time to initialize coeftab 5.313659e+00 s Test #3280: mpi_dst_example_simple_lap_c_facto1_sched4_kway_rqrcpend ................***Timeout 295.50 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ 
Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.408154e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.406134e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.815476e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.603996e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.630538e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.598769e+00 s Time to initialize coeftab 7.851567e-01 s Time to factorize 6.051715e+01 s (360.55 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Test #3281: mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_rqrcpbegin ...***Timeout 295.49 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: 
sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.260567e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.106927e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.638948e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.327468e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.947902e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.443997e+00 s Time to initialize coeftab 
4.527080e+00 s Test #3282: mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_rqrcpend .....***Timeout 295.48 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.621732e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.393614e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.082794e-01 s 
+-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.582229e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.040696e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.836728e+00 s Time to initialize coeftab 9.388845e-01 s Time to factorize 4.302810e+01 s (507.10 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 2.533033e+01 s - iteration 1 : total iteration time 13.6 s error 1.3513e-11 Test #3283: mpi_dst_example_simple_lap_c_facto1_sched4_not_tqrcpbegin ...............***Timeout 295.48 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 
GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.538118e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.108558e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.904046e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 8.289334e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.399748e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.152192e+00 s Time to initialize coeftab 9.663518e+00 s Test #3285: mpi_dst_example_simple_lap_c_facto1_sched4_kway_tqrcpbegin ..............***Timeout 295.47 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : 
Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.653980e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.316102e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.107442e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.254264e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.858697e+01 s 
+-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.478685e+00 s Time to initialize coeftab 6.731111e+00 s Test #3286: mpi_dst_example_simple_lap_c_facto1_sched4_kway_tqrcpend ................***Timeout 295.15 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.541927e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.796541e-02 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.817336e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.264453e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.300749e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.129406e+00 s Time to initialize coeftab 6.213960e-01 s Time to factorize 6.009830e+01 s (363.06 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Test #3287: mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_tqrcpbegin ...***Timeout 295.14 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models 
CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.276741e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.701448e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.617279e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.379999e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.588900e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.598312e+00 s Time to initialize coeftab 6.895275e+00 s Test #3288: mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_tqrcpend .....***Timeout 295.13 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.881157e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.338057e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.057634e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.868583e+00 s 
+-------------------------------------------------+ Analyze task: Total time for analyze 3.126480e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.252399e+00 s Time to initialize coeftab 1.127998e+00 s Test #3289: mpi_dst_example_simple_lap_c_facto1_sched4_not_rqrrtbegin ...............***Timeout 294.36 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.663557e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of 
nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.894178e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.858890e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.380104e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.422643e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.758670e+00 s Time to initialize coeftab 6.754361e+00 s Test #3290: mpi_dst_example_simple_lap_c_facto1_sched4_not_rqrrtend .................***Timeout 294.35 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting 
Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.396140e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.555228e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.336398e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.617585e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.521442e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.275036e-01 s Time to initialize coeftab 4.824463e-01 s Time to factorize 5.818719e+01 s (374.99 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 2.890264e+01 s Test #3291: mpi_dst_example_simple_lap_c_facto1_sched4_kway_rqrrtbegin ..............***Timeout 294.35 sec ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.226091e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.204704e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.712991e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time 
to factorize 6.932254e-04 s Time for mapping/scheduling 7.797363e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.214930e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.891316e+00 s Time to initialize coeftab 1.357723e+01 s Test #3292: mpi_dst_example_simple_lap_c_facto1_sched4_kway_rqrrtend ................***Timeout 294.34 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.356877e+01 s +-------------------------------------------------+ Symbolic 
factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.009858e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.190766e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.130824e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.345034e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.936185e+00 s Time to initialize coeftab 9.141627e-01 s Time to factorize 5.628790e+01 s (387.64 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 88.6 Ko / 88.6 Ko ------------------------------------------------ Total 137 Ko / 137 Ko Time to solve 4.659684e+01 s Time for refinement 4.006019e+00 s || A ||_1 5.112499e-02 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112499e-02 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468971e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.886458e-07 max(|| b_i - A x_i ||_1) 8.081049e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.039130e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.886458e-07 max(|| b_i - A x_i ||_1) 8.081049e-08 max(|| b_i - A x_i 
||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.039130e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.886458e-07 max(|| b_i - A x_i ||_1) 8.081049e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.039130e+00 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.886458e-07 max(|| b_i - A x_i ||_1) 8.081049e-08 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.039130e+00 (SUCCESS) Test #3293: mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_rqrrtbegin ...***Timeout 294.33 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.633472e+01 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.446108e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.486742e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 9.298328e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.801908e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.996491e+00 s Time to initialize coeftab 1.001634e+01 s Test #3294: mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_rqrrtend .....***Timeout 294.32 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of 
kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.621296e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.214602e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.351676e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.747601e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.827635e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.066575e+00 s Time to initialize coeftab 5.672400e-01 s Test #3295: mpi_dst_example_simple_lap_c_facto1_sched4_kway_pqrcpilu0 ...............***Timeout 293.70 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of 
threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.814259e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.716295e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.746082e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.844074e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.214815e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Test #3296: 
mpi_dst_example_simple_lap_c_facto1_sched4_kway_pqrcpilu1 ...............***Timeout 293.69 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.449843e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.352751e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.290141e-01 s +-------------------------------------------------+ Mapping/Scheduling 
subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.998562e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.480044e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.814244e+00 s Time to initialize coeftab 4.388421e-01 s Test #3297: mpi_dst_example_simple_lap_c_facto2_sched4_not_svdbegin .................***Timeout 293.68 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 
+-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.454106e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.692455e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.937407e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 6.194037e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.436736e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.251492e+00 s Time to initialize coeftab 4.826782e+00 s Test #3299: mpi_dst_example_simple_lap_c_facto2_sched4_kway_svdbegin ................***Timeout 293.67 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 
Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.408261e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.530160e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.541175e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 8.037152e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.800520e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 5.996310e+00 s Time to initialize coeftab 1.254296e+01 s Test #3300: mpi_dst_example_simple_lap_c_facto2_sched4_kway_svdend ..................***Timeout 293.66 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse 
matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.700977e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.332045e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.600455e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 6.356499e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.720609e+01 s 
+-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.635141e+00 s Time to initialize coeftab 4.025055e-01 s Time to factorize 8.340951e+01 s (490.70 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Test #3301: mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_svdbegin .....***Timeout 293.65 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 
1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.269764e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.927190e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.313533e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 7.076519e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.325049e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.338112e+00 s Time to initialize coeftab 5.661837e+00 s Test #3302: mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_svdend .......***Timeout 293.64 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low 
rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.764878e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.398563e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.165385e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 9.516214e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.175309e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 5.320584e+00 s Time to initialize coeftab 1.507908e+00 s Test #3303: mpi_dst_example_simple_lap_c_facto2_sched4_not_pqrcpbegin ...............***Timeout 293.63 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ 
Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.591927e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.319768e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.378295e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 5.378167e+00 s +-------------------------------------------------+ Analyze task: Total time 
for analyze 2.384112e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.890998e+00 s Time to initialize coeftab 2.555995e+00 s Time to factorize 1.365134e+02 s (299.82 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Test #3305: mpi_dst_example_simple_lap_c_facto2_sched4_kway_pqrcpbegin ..............***Timeout 293.62 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N 
nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.363620e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.032441e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.540054e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 9.790857e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.156612e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.787053e+00 s Time to initialize coeftab 4.890832e+00 s Test #3325: mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_rqrrtbegin ...***Timeout 292.93 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 
1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.694471e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.291039e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.620344e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.303912e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.097075e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.802169e+00 s Time to initialize coeftab 1.822847e+01 s Test #3326: mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_rqrrtend .....***Timeout 292.91 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.966189e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.144997e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.744559e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 5.086321e+00 s 
+-------------------------------------------------+ Analyze task: Total time for analyze 3.363870e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.044915e+00 s Time to initialize coeftab 6.027839e-01 s Time to factorize 5.358413e+01 s (763.83 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 48.4 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 226 Ko / 226 Ko Time to solve 3.823863e+01 s Test #3329: mpi_dst_example_simple_lap_c_facto3_sched4_not_svdbegin .................***Timeout 292.90 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been 
automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.523873e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.077906e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.946651e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 9.747140e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.871991e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 3.433947e+00 s Time to initialize coeftab 5.720503e+00 s Test #3331: mpi_dst_example_simple_lap_c_facto3_sched4_kway_svdbegin ................***Timeout 292.89 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) 
Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.024331e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.311861e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.193107e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 5.544056e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.117077e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 1.294694e+00 s Time to initialize coeftab 4.783139e+00 s Test #3332: mpi_dst_example_simple_lap_c_facto3_sched4_kway_svdend ..................***Timeout 292.86 sec ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.311876e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.173479e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.276864e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to 
factorize 5.477282e-04 s Time for mapping/scheduling 6.026428e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.198822e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 4.255284e+00 s Time to initialize coeftab 5.971719e-01 s Test #3333: mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_svdbegin .....***Timeout 292.86 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.465297e+01 s +-------------------------------------------------+ Symbolic 
factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.676333e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.501173e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.038474e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.940507e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 3.928841e+00 s Time to initialize coeftab 5.865069e+00 s Test #3345: mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_rqrcpbegin ...***Timeout 292.83 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance 
criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.442378e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.925562e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.022692e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 6.328620e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.325990e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 4.048388e+00 s Time to initialize coeftab 6.998773e+00 s Test #3349: mpi_dst_example_simple_lap_c_facto3_sched4_kway_tqrcpbegin ..............***Timeout 291.37 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: 
Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.394858e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.161836e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.673022e-02 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 4.932029e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.063471e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 1.821850e+00 s Time to initialize coeftab 5.948476e+00 s Test #3351: 
mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_tqrcpbegin ...***Timeout 290.08 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #3361: mpi_dst_example_simple_lap_c_facto4_sched4_not_svdbegin .................***Timeout 290.07 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: 
Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.512817e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.655358e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.531992e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.414327e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.341175e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize 
internal csc 2.607198e+00 s Time to initialize coeftab 4.528235e+00 s Test #3362: mpi_dst_example_simple_lap_c_facto4_sched4_not_svdend ...................***Timeout 290.06 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.553343e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.800608e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 
2.330771e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 9.001558e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.785080e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 3.097793e+00 s Time to initialize coeftab 9.651505e-01 s Test #3363: mpi_dst_example_simple_lap_c_facto4_sched4_kway_svdbegin ................***Timeout 290.04 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 
1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.759398e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.388017e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.742966e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 9.948151e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.034972e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 4.086729e+00 s Time to initialize coeftab 8.740064e+00 s Test #3365: mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_svdbegin .....***Timeout 289.46 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 
16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.392097e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.983938e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.171654e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 8.961987e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.022626e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 3.799258e+00 s Time to initialize coeftab 9.437297e+00 s Test #3367: mpi_dst_example_simple_lap_c_facto4_sched4_not_pqrcpbegin ...............***Timeout 288.63 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The 
thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.914990e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.402673e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.281680e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.068872e+01 s +-------------------------------------------------+ Analyze task: 
Total time for analyze 4.866363e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 5.376649e+00 s Time to initialize coeftab 3.222206e+00 s Test #3369: mpi_dst_example_simple_lap_c_facto4_sched4_kway_pqrcpbegin ..............***Timeout 287.93 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.302065e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute 
symbol matrix 1.763255e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.890733e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.204156e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.339709e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 6.200581e+00 s Time to initialize coeftab 5.287072e+00 s Test #3371: mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_pqrcpbegin ...***Timeout 286.63 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set 
to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.489768e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.234548e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.813692e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.220123e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.189351e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 5.603536e+00 s Time to initialize coeftab 5.024005e+00 s Test #3373: mpi_dst_example_simple_lap_c_facto4_sched4_not_rqrcpbegin ...............***Timeout 286.05 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per 
process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.521048e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.585520e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.562114e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.366315e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.666524e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 6.315135e+00 s Time to initialize coeftab 1.096888e+01 s Test #3374: mpi_dst_example_simple_lap_c_facto4_sched4_not_rqrcpend .................***Timeout 286.03 sec ischedInit: The thread number has been 
automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.925923e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.364051e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.338135e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops 
Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.583545e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.650640e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 4.750207e+00 s Time to initialize coeftab 1.056058e+00 s Test #3375: mpi_dst_example_simple_lap_c_facto4_sched4_kway_rqrcpbegin ..............***Timeout 286.02 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.343739e+01 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.416978e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.151327e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.575812e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.501648e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 4.323865e+00 s Test #3377: mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_rqrcpbegin ...***Timeout 285.76 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute 
Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.746597e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.753310e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.333070e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Test #3379: mpi_dst_example_simple_lap_c_facto4_sched4_not_tqrcpbegin ...............***Timeout 285.75 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method 
CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.361744e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.508943e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.179809e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.215648e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.550217e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Test #3381: mpi_dst_example_simple_lap_c_facto4_sched4_kway_tqrcpbegin ..............***Timeout 285.74 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 
Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal ischedInit: The thread number has been automatically set to 256 Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.654121e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.072120e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.410208e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.509907e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.584314e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 5.525500e+00 s Time to initialize coeftab 1.171009e+01 s Test #3383: 
mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_tqrcpbegin ...***Timeout 285.72 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.819410e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.421232e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.972381e-01 s +-------------------------------------------------+ 
Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.166921e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.855997e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 3.777117e+00 s Time to initialize coeftab 6.461773e+00 s Test #3385: mpi_dst_example_simple_lap_c_facto4_sched4_not_rqrrtbegin ...............***Timeout 285.70 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 
+-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Test #3387: mpi_dst_example_simple_lap_c_facto4_sched4_kway_rqrrtbegin ..............***Timeout 285.45 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.429643e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.552487e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping 
criterion -1 Time for reordering 5.577417e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.811093e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.545714e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 3.028692e+00 s Time to initialize coeftab 5.277089e+00 s Test #3389: mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_rqrrtbegin ...***Timeout 285.44 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: 
CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.775522e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.868332e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.122361e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.053364e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.945278e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 3.569790e+00 s Time to initialize coeftab 7.063130e+00 s Test #3392: mpi_dst_example_simple_lap_c_facto4_sched4_kway_pqrcpilu1 ...............***Timeout 285.43 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size 
(min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.210688e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.801127e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.364643e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 9.275394e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.206387e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 4.640473e+00 s Time to initialize coeftab 1.309019e+00 s Test #3393: mpi_dst_example_simple_lap_z_facto0_sched4_not_svdbegin .................***Timeout 285.42 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.228663e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.932026e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.494193e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.384335e+01 s 
+-------------------------------------------------+ Analyze task: Total time for analyze 4.737559e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 5.736018e+00 s Time to initialize coeftab 1.104968e+01 s Test #3394: mpi_dst_example_simple_lap_z_facto0_sched4_not_svdend ...................***Timeout 285.41 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.459922e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes 
in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.327955e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.338100e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 5.982287e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.126425e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.185075e+00 s Time to initialize coeftab 5.750002e-01 s Test #3395: mpi_dst_example_simple_lap_z_facto0_sched4_kway_svdbegin ................***Timeout 285.40 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY 
Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.498913e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.135974e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.959077e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 6.077751e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.582289e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.333571e+00 s Time to initialize coeftab 5.141794e+00 s Test #3396: mpi_dst_example_simple_lap_z_facto0_sched4_kway_svdend ..................***Timeout 285.39 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of 
GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.428788e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.540704e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.311976e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 7.576035e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.213687e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.552483e+00 s Time to initialize coeftab 6.949929e-01 s Test #3397: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_svdbegin .....***Timeout 285.39 sec ischedInit: 
The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.189612e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.508267e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.577067e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in 
full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 7.648377e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.234231e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.425617e+00 s Time to initialize coeftab 8.401218e+00 s Test #3399: mpi_dst_example_simple_lap_z_facto0_sched4_not_pqrcpbegin ...............***Timeout 285.38 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.342690e+01 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.259829e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.693776e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 5.034813e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.093836e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.398667e+00 s Time to initialize coeftab 1.653162e+00 s Test #3401: mpi_dst_example_simple_lap_z_facto0_sched4_kway_pqrcpbegin ..............***Timeout 285.37 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 
Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.232791e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.090909e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.981167e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 4.127959e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.679143e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.133530e+00 s Time to initialize coeftab 1.642978e+00 s Test #3403: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_pqrcpbegin ...***Timeout 285.36 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: 
Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.241379e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.739675e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.887210e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 2.790800e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.562174e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.715339e+00 
s Time to initialize coeftab 1.391334e+00 s Test #3405: mpi_dst_example_simple_lap_z_facto0_sched4_not_rqrcpbegin ...............***Timeout 285.35 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.373799e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.969958e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.706487e-01 s 
+-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 9.792078e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.503651e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.932013e+00 s Time to initialize coeftab 3.824518e+00 s Test #3407: mpi_dst_example_simple_lap_z_facto0_sched4_kway_rqrcpbegin ..............***Timeout 285.35 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 
200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.221794e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.364790e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.864949e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.018355e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.340444e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.149990e+00 s Time to initialize coeftab 6.758028e+00 s Test #3409: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_rqrcpbegin ...***Timeout 285.33 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min 
ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.243257e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.045217e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.946249e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.177825e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.384643e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 4.698370e-02 s Time to initialize coeftab 7.348102e+00 s Test #3411: mpi_dst_example_simple_lap_z_facto0_sched4_not_tqrcpbegin ...............***Timeout 285.33 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.383645e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.306938e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.616318e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.072422e+01 s +-------------------------------------------------+ 
Analyze task: Total time for analyze 2.500884e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.149084e+00 s Time to initialize coeftab 4.508972e+00 s Test #3412: mpi_dst_example_simple_lap_z_facto0_sched4_not_tqrcpend .................***Timeout 285.32 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.812226e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time 
to compute symbol matrix 1.642024e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.022085e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.120539e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.481386e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.322542e+00 s Time to initialize coeftab 8.568466e-01 s Test #3413: mpi_dst_example_simple_lap_z_facto0_sched4_kway_tqrcpbegin ..............***Timeout 285.31 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 
Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.410412e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.101319e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.415659e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 6.368024e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.330811e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.937521e+00 s Time to initialize coeftab 5.504708e+00 s Test #3414: mpi_dst_example_simple_lap_z_facto0_sched4_kway_tqrcpend ................***Timeout 285.30 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL 
GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.346006e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.078666e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.151679e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 8.521637e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.525639e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 5.753884e+00 s Time to initialize coeftab 7.449430e-01 s Test #3415: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_tqrcpbegin ...***Timeout 285.29 sec ischedInit: The thread number has been automatically set to 
256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.294639e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.589714e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.182383e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model 
AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 5.149864e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.056163e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.743353e+00 s Time to initialize coeftab 5.720711e+00 s Test #3416: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_tqrcpend .....***Timeout 285.29 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.182404e+01 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.743773e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.250500e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.069786e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.865199e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.894054e+00 s Time to initialize coeftab 6.518680e-01 s Test #3417: mpi_dst_example_simple_lap_z_facto0_sched4_not_rqrrtbegin ...............***Timeout 285.28 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 
Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.703327e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.707583e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.439578e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 4.991581e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.416234e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 8.696978e-02 s Time to initialize coeftab 4.662010e+00 s Test #3418: mpi_dst_example_simple_lap_z_facto0_sched4_not_rqrrtend .................***Timeout 285.27 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled 
StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.143873e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.349630e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.335774e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 8.684785e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.066685e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.539669e+00 s Time 
to initialize coeftab 5.570269e-01 s Test #3419: mpi_dst_example_simple_lap_z_facto0_sched4_kway_rqrrtbegin ..............***Timeout 285.26 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.329325e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.586635e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.482929e-01 s 
+-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.357999e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.522959e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.923366e+00 s Time to initialize coeftab 4.061729e+00 s Test #3420: mpi_dst_example_simple_lap_z_facto0_sched4_kway_rqrrtend ................***Timeout 285.25 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 
760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.608085e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.488236e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.115010e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 5.421276e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.197740e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.256551e+00 s Time to initialize coeftab 7.350294e-01 s Test #3421: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_rqrrtbegin ...***Timeout 285.25 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: 
Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.498302e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.033570e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.536346e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 7.360732e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.673055e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 1.937616e+00 s Time to initialize coeftab 5.703848e+00 s Test #3422: mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_rqrrtend .....***Timeout 285.24 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 
Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch 1: 300 1140 2: 200 760 3: 200 660 Time to compute ordering 1.430446e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.131665e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.471302e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 8.080146e+00 s +-------------------------------------------------+ Analyze task: Total time for 
analyze 2.324871e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.793481e+00 s Time to initialize coeftab 5.237061e-01 s Test #3423: mpi_dst_example_simple_lap_z_facto0_sched4_kway_pqrcpilu0 ...............***Timeout 285.23 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.347103e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 
7.333863e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.543857e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 5.318713e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.071266e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 3.565336e+00 s Time to initialize coeftab 9.659865e-01 s Time to factorize 8.112057e+01 s (256.01 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 9.489087e+00 s Test #3424: mpi_dst_example_simple_lap_z_facto0_sched4_kway_pqrcpilu1 ...............***Timeout 285.22 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size 
(min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.540926e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.319255e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.964261e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^h 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.254727e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.962339e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^h Time to initialize internal csc 2.229023e+00 s Time to initialize coeftab 7.766235e-01 s Test #3425: mpi_dst_example_simple_lap_z_facto1_sched4_not_svdbegin .................***Timeout 285.21 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been 
automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.384164e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.048937e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.947155e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.766253e+00 s 
+-------------------------------------------------+ Analyze task: Total time for analyze 2.287330e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.620017e+00 s Time to initialize coeftab 4.211473e+00 s Test #3426: mpi_dst_example_simple_lap_z_facto1_sched4_not_svdend ...................***Timeout 285.21 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.746449e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of 
nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.923251e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.470272e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.038210e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.521302e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.507018e+00 s Time to initialize coeftab 1.150067e+00 s Test #3427: mpi_dst_example_simple_lap_z_facto1_sched4_kway_svdbegin ................***Timeout 285.20 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 
ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.780315e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.886792e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.114267e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.955836e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.858962e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.329094e+00 s Time to initialize coeftab 4.453199e+00 s Test #3428: mpi_dst_example_simple_lap_z_facto1_sched4_kway_svdend ..................***Timeout 285.19 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 
256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.620184e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.216921e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.750659e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.984560e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.594420e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.723882e+00 s Time to initialize coeftab 8.725956e-01 s Test #3429: mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_svdbegin .....***Timeout 285.18 
sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.220000e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.419133e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.063102e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of 
operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.338933e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.803892e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.557311e+00 s Time to initialize coeftab 4.376742e+00 s Test #3430: mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_svdend .......***Timeout 285.17 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute 
ordering 1.375044e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.256569e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.117531e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.638632e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.304729e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.774514e+00 s Time to initialize coeftab 1.869225e+00 s Time to factorize 7.541059e+01 s (289.34 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Test #3431: mpi_dst_example_simple_lap_z_facto1_sched4_not_pqrcpbegin ...............***Timeout 285.16 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI 
communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.615346e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.712292e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.290958e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.824691e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.585584e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.017105e-01 s Time to initialize coeftab 1.860536e+00 s Test #3433: 
mpi_dst_example_simple_lap_z_facto1_sched4_kway_pqrcpbegin ..............***Timeout 285.16 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.427485e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.363551e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.628873e-01 s +-------------------------------------------------+ Mapping/Scheduling 
subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 8.231406e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.497966e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.300100e+00 s Time to initialize coeftab 1.855026e+00 s Test #3435: mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_pqrcpbegin ...***Timeout 285.15 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 
+-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.377410e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.465705e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.769853e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.915908e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.296511e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.439142e+00 s Time to initialize coeftab 1.676422e+00 s Test #3436: mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_pqrcpend .....***Timeout 285.14 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal 
width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.262861e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.752386e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.747995e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.049844e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.373303e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.212037e+00 s Time to initialize coeftab 5.037056e-01 s Test #3437: mpi_dst_example_simple_lap_z_facto1_sched4_not_rqrcpbegin ...............***Timeout 285.13 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been 
automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.600052e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.564799e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.991698e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.290601e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 
2.373338e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 3.499034e+00 s Time to initialize coeftab 6.312190e+00 s Test #3439: mpi_dst_example_simple_lap_z_facto1_sched4_kway_rqrcpbegin ..............***Timeout 285.12 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.589603e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.664281e-01 
s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.419511e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 8.035600e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.815285e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 5.172125e+00 s Time to initialize coeftab 8.376053e+00 s Test #3440: mpi_dst_example_simple_lap_z_facto1_sched4_kway_rqrcpend ................***Timeout 285.11 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been 
automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.359959e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.728204e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.039204e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.311714e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.519130e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 2.818732e+00 s Time to initialize coeftab 5.319155e-01 s Test #3441: mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_rqrcpbegin ...***Timeout 285.10 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low 
rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.324199e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.094762e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.417560e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.285605e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.250422e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 1.432544e+00 s Time to initialize coeftab 4.108780e+00 s Test #3510: mpi_dst_example_simple_lap_z_facto3_sched4_kway_tqrcpend ................***Timeout 285.05 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The 
thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.366020e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.099353e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.108829e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 
5.477282e-04 s Time for mapping/scheduling 4.797633e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.467189e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 2.393420e+00 s Time to initialize coeftab 6.624880e-01 s Start 3510: mpi_dst_example_simple_lap_z_facto3_sched4_kway_tqrcpend Test #3511: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_tqrcpbegin ...***Timeout 285.05 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.652987e+01 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.135354e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.123136e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 6.074190e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.640690e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 3.497709e+00 s Time to initialize coeftab 6.861093e+00 s Start 3511: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_tqrcpbegin Test #3512: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_tqrcpend .....***Timeout 285.05 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress 
minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.015054e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.271510e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.567225e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 8.446262e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.016216e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 7.166955e-01 s Time to initialize coeftab 9.537415e-01 s Time to factorize 7.488694e+01 s (277.32 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 
Ko Start 3512: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_tqrcpend Test #3513: mpi_dst_example_simple_lap_z_facto3_sched4_not_rqrrtbegin ...............***Timeout 285.05 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.547068e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.909762e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time 
for reordering 1.977850e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 3.095781e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.036407e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 1.872140e+00 s Time to initialize coeftab 4.758877e+00 s Start 3513: mpi_dst_example_simple_lap_z_facto3_sched4_not_rqrrtbegin Test #3514: mpi_dst_example_simple_lap_z_facto3_sched4_not_rqrrtend .................***Timeout 285.05 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: 
Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.559883e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.021451e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.501146e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.435697e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.099905e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 9.176266e+00 s Time to initialize coeftab 1.567295e+00 s Start 3514: mpi_dst_example_simple_lap_z_facto3_sched4_not_rqrrtend Test #3515: mpi_dst_example_simple_lap_z_facto3_sched4_kway_rqrrtbegin ..............***Timeout 285.04 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia 
K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.348748e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.149057e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.775752e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.200374e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.663561e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 4.518611e+00 s Time to initialize coeftab 1.052991e+01 s Start 3515: mpi_dst_example_simple_lap_z_facto3_sched4_kway_rqrrtbegin Test #3516: mpi_dst_example_simple_lap_z_facto3_sched4_kway_rqrrtend ................***Timeout 284.99 sec 
ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.352947e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.773396e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.082607e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in 
full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.195454e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.969204e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 6.539067e+00 s Time to initialize coeftab 1.584997e+00 s Start 3516: mpi_dst_example_simple_lap_z_facto3_sched4_kway_rqrrtend Test #3517: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_rqrrtbegin ...***Timeout 284.99 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ 
Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.542703e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.460590e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.646452e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.033092e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.829680e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 1.157429e-01 s Time to initialize coeftab 7.513022e+00 s Start 3517: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_rqrrtbegin Test #3518: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_rqrrtend .....***Timeout 284.98 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD 
Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.164201e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.145414e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.306031e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 6.647895e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.142601e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 4.552440e+00 s Time to initialize coeftab 1.073804e+01 s Time to factorize 6.275161e+01 s (330.95 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o 
Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Start 3518: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_rqrrtend Test #3519: mpi_dst_example_simple_lap_z_facto3_sched4_kway_pqrcpilu0 ...............***Timeout 284.96 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.549285e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.511060e+00 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.615092e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Start 3519: mpi_dst_example_simple_lap_z_facto3_sched4_kway_pqrcpilu0 Test #3520: mpi_dst_example_simple_lap_z_facto3_sched4_kway_pqrcpilu1 ...............***Timeout 284.94 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.410690e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of 
nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.923315e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.179867e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 9.305333e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.659165e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 5.271840e+00 s Time to initialize coeftab 1.164053e+00 s Start 3520: mpi_dst_example_simple_lap_z_facto3_sched4_kway_pqrcpilu1 Test #3521: mpi_dst_example_simple_lap_z_facto4_sched4_not_svdbegin .................***Timeout 284.90 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 
2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.959268e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.696886e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.996634e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Start 3521: mpi_dst_example_simple_lap_z_facto4_sched4_not_svdbegin Test #3522: mpi_dst_example_simple_lap_z_facto4_sched4_not_svdend ...................***Timeout 284.87 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method 
SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.296659e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.399846e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.931900e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.909032e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.745595e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 2.871751e+00 s Time to initialize coeftab 7.776856e-01 s Start 3522: mpi_dst_example_simple_lap_z_facto4_sched4_not_svdend Test #3523: mpi_dst_example_simple_lap_z_facto4_sched4_kway_svdbegin ................***Timeout 284.86 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread 
dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.812737e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.376776e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.155455e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.250794e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.731406e+01 s +-------------------------------------------------+ 
Factorization task: Factorization used: LDL^h Time to initialize internal csc 6.643813e+00 s Start 3523: mpi_dst_example_simple_lap_z_facto4_sched4_kway_svdbegin Test #3524: mpi_dst_example_simple_lap_z_facto4_sched4_kway_svdend ..................***Timeout 284.86 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.706140e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.178085e-01 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.528416e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.992756e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.686456e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 4.711986e+00 s Time to initialize coeftab 3.613812e-01 s Start 3524: mpi_dst_example_simple_lap_z_facto4_sched4_kway_svdend Test #3525: mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_svdbegin .....***Timeout 284.79 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 
3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.367318e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 7.126512e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.008815e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 8.121717e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.266739e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 3.788660e+00 s Time to initialize coeftab 7.235616e+00 s Start 3525: mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_svdbegin Test #3526: mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_svdend .......***Timeout 283.66 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: 
Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.814478e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.679984e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.931857e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 7.310487e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.822622e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 1.028363e-01 s Time to initialize coeftab 4.683794e-01 s Start 3526: 
mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_svdend Test #3527: mpi_dst_example_simple_lap_z_facto4_sched4_not_pqrcpbegin ...............***Timeout 282.08 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.336841e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.962886e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 
1.842565e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.611586e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.082838e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 5.648729e+00 s Time to initialize coeftab 3.699014e+00 s Start 3527: mpi_dst_example_simple_lap_z_facto4_sched4_not_pqrcpbegin Test #3528: mpi_dst_example_simple_lap_z_facto4_sched4_not_pqrcpend .................***Timeout 281.01 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian 
Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.527402e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.099970e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.994206e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 8.329707e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.432678e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 4.427049e+00 s Time to initialize coeftab 8.257256e-01 s Time to factorize 7.365072e+01 s (296.25 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Start 3528: mpi_dst_example_simple_lap_z_facto4_sched4_not_pqrcpend Test #3529: mpi_dst_example_simple_lap_z_facto4_sched4_kway_pqrcpbegin ..............***Timeout 280.44 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 5.071852e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.645863e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.039761e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.249553e+01 s +-------------------------------------------------+ 
Analyze task: Total time for analyze 6.771904e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 9.174449e+00 s Start 3529: mpi_dst_example_simple_lap_z_facto4_sched4_kway_pqrcpbegin Test #3530: mpi_dst_example_simple_lap_z_facto4_sched4_kway_pqrcpend ................***Timeout 279.77 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.098038e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 
Fill-in of L 17.592432 Time to compute symbol matrix 7.173274e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.471782e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.789646e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.448120e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 2.141553e+00 s Time to initialize coeftab 8.964043e-01 s Start 3530: mpi_dst_example_simple_lap_z_facto4_sched4_kway_pqrcpend Test #3531: mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_pqrcpbegin ...***Timeout 279.75 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections 
distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.284310e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.978774e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.983473e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.292676e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.123430e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 2.968723e+00 s Time to initialize coeftab 2.137023e+00 s Start 3531: mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_pqrcpbegin Test #3532: mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_pqrcpend .....***Timeout 279.73 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.770060e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.538112e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.028756e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Start 3532: mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_pqrcpend Test #3533: mpi_dst_example_simple_lap_z_facto4_sched4_not_rqrcpbegin ...............***Timeout 279.72 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been 
automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.566095e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.585885e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.160824e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 9.998988e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 
2.829191e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 3.657509e+00 s Time to initialize coeftab 1.326425e+01 s Start 3533: mpi_dst_example_simple_lap_z_facto4_sched4_not_rqrcpbegin Test #3534: mpi_dst_example_simple_lap_z_facto4_sched4_not_rqrcpend .................***Timeout 279.10 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.945149e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 
65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.826152e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.179059e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.366030e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.879230e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 4.403491e+00 s Time to initialize coeftab 1.103005e+00 s Start 3534: mpi_dst_example_simple_lap_z_facto4_sched4_not_rqrcpend Test #3535: mpi_dst_example_simple_lap_z_facto4_sched4_kway_rqrcpbegin ..............***Timeout 278.47 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block 
Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.361523e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.294586e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.939646e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 5.514222e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.239033e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 1.314286e+00 s Time to initialize coeftab 6.026505e+00 s Start 3535: mpi_dst_example_simple_lap_z_facto4_sched4_kway_rqrcpbegin Test #3536: mpi_dst_example_simple_lap_z_facto4_sched4_kway_rqrcpend ................***Timeout 277.87 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number 
of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.279100e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.480911e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.118527e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Start 3536: mpi_dst_example_simple_lap_z_facto4_sched4_kway_rqrcpend Test #3537: mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_rqrcpbegin ...***Timeout 276.85 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: 
Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.357200e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.985516e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.944769e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.324837e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.764572e+01 s +-------------------------------------------------+ 
Factorization task: Factorization used: LDL^h Time to initialize internal csc 5.680665e+00 s Time to initialize coeftab 1.195397e+01 s Start 3537: mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_rqrcpbegin Test #3538: mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_rqrcpend .....***Timeout 276.38 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3538: mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_rqrcpend Test #3539: mpi_dst_example_simple_lap_z_facto4_sched4_not_tqrcpbegin ...............***Timeout 276.36 sec ischedInit: The thread number has been automatically 
set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.616187e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.315781e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.676977e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 
6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 8.535042e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.521662e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 4.245920e+00 s Time to initialize coeftab 1.274133e+01 s Start 3539: mpi_dst_example_simple_lap_z_facto4_sched4_not_tqrcpbegin Test #3540: mpi_dst_example_simple_lap_z_facto4_sched4_not_tqrcpend .................***Timeout 275.76 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 
2.173631e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.699604e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.653692e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.164776e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.460023e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 8.073224e+00 s Time to initialize coeftab 1.277002e+00 s Start 3540: mpi_dst_example_simple_lap_z_facto4_sched4_not_tqrcpend Test #3541: mpi_dst_example_simple_lap_z_facto4_sched4_kway_tqrcpbegin ..............***Timeout 275.62 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress 
minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.933597e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.077999e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.155381e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Start 3541: mpi_dst_example_simple_lap_z_facto4_sched4_kway_tqrcpbegin Test #3542: mpi_dst_example_simple_lap_z_facto4_sched4_kway_tqrcpend ................***Timeout 274.79 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal 
height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.461785e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.203926e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.240048e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 8.644378e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.375280e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 3.718665e+00 s Time to initialize coeftab 8.048158e-01 s Start 3542: mpi_dst_example_simple_lap_z_facto4_sched4_kway_tqrcpend Test #3543: mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_tqrcpbegin ...***Timeout 275.12 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.128109e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.544952e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.576937e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for 
mapping/scheduling 1.300074e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.062368e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 8.836767e+00 s Time to initialize coeftab 1.250233e+01 s Start 3543: mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_tqrcpbegin Test #3544: mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_tqrcpend .....***Timeout 274.63 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.387930e+01 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.287709e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.855407e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.125794e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.713878e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 6.801502e+00 s Time to initialize coeftab 1.348038e+00 s Start 3544: mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_tqrcpend Test #3545: mpi_dst_example_simple_lap_z_facto4_sched4_not_rqrrtbegin ...............***Timeout 274.62 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 
1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.456900e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.176207e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.070175e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 4.359648e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.139590e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 4.242381e+00 s Time to initialize coeftab 6.294873e+00 s Start 3545: mpi_dst_example_simple_lap_z_facto4_sched4_not_rqrrtbegin Test #3546: mpi_dst_example_simple_lap_z_facto4_sched4_not_rqrrtend .................***Timeout 273.61 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been 
automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.700731e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.731037e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.682011e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.292347e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 
4.774278e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Start 3546: mpi_dst_example_simple_lap_z_facto4_sched4_not_rqrrtend Test #3547: mpi_dst_example_simple_lap_z_facto4_sched4_kway_rqrrtbegin ..............***Timeout 272.55 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.619253e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.599334e+00 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.731113e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.131613e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.493245e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 4.240003e+00 s Start 3547: mpi_dst_example_simple_lap_z_facto4_sched4_kway_rqrrtbegin Test #3548: mpi_dst_example_simple_lap_z_facto4_sched4_kway_rqrrtend ................***Timeout 272.54 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 
Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3548: mpi_dst_example_simple_lap_z_facto4_sched4_kway_rqrrtend Test #3549: mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_rqrrtbegin ...***Timeout 272.32 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.550360e+01 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.874240e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.135631e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 9.220172e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.608983e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 3.067253e+00 s Time to initialize coeftab 9.885245e+00 s Start 3549: mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_rqrrtbegin Test #3550: mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_rqrrtend .....***Timeout 271.75 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization 
method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.913308e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.394351e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.668118e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.706565e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.781419e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 7.687045e+00 s Start 3550: mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_rqrrtend Test #3551: mpi_dst_example_simple_lap_z_facto4_sched4_kway_pqrcpilu0 ...............***Timeout 271.58 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled 
thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.455154e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 5.501637e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.629035e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.251207e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.846137e+01 s 
+-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 6.719817e+00 s Time to initialize coeftab 1.054299e+00 s Start 3551: mpi_dst_example_simple_lap_z_facto4_sched4_kway_pqrcpilu0 Test #3552: mpi_dst_example_simple_lap_z_facto4_sched4_kway_pqrcpilu1 ...............***Timeout 271.07 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.726592e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of 
L 17.592432 Time to compute symbol matrix 8.300800e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.914377e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^h 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 6.395102e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.627645e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^h Time to initialize internal csc 5.206337e+00 s Time to initialize coeftab 1.374009e+00 s Start 3552: mpi_dst_example_simple_lap_z_facto4_sched4_kway_pqrcpilu1 Test #3572: bcsc_mpi_rep_test_bcsc_spmv_tests_lap_s .................................***Timeout 269.49 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.187077e+01 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.948637e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.599723e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: -- BCSC MatVec Test -- Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.170807e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.922575e+01 s Time to initialize internal csc 2.466597e-03 s -- BCSC MatVec Test -- Case Sequential - Float - Symmetric - NoTrans: -- BCSC MatVec Test -- -- BCSC MatVec Test -- Start 3572: bcsc_mpi_rep_test_bcsc_spmv_tests_lap_s Test #3573: bcsc_mpi_rep_test_bcsc_spmv_tests_lap_d .................................***Timeout 269.00 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.564818e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.725229e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.650279e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: -- BCSC MatVec Test -- -- BCSC MatVec Test -- Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.367252e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.522313e+01 s Time to initialize internal csc 2.653792e-02 s -- BCSC MatVec Test -- Case Sequential - Double - Symmetric - NoTrans: -- BCSC MatVec Test -- Start 3573: bcsc_mpi_rep_test_bcsc_spmv_tests_lap_d Test #3574: bcsc_mpi_rep_test_bcsc_spmv_tests_lap_c .................................***Timeout 268.31 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational 
models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.181141e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.456778e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 6.882011e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: -- BCSC MatVec Test -- -- BCSC MatVec Test -- Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.197900e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.908558e+01 s -- BCSC MatVec Test -- Time to initialize internal csc 7.369358e-01 s -- BCSC MatVec Test -- Case Sequential - Complex32 - Hermitian - NoTrans: [ 0] ||A||_inf = 1.200000e+01, ||x||_inf = 6.882302e-01, ||y||_inf = 6.931639e-01 [ 2] ||A||_inf = 1.200000e+01, ||x||_inf = 6.882302e-01, ||y||_inf = 6.931639e-01 [ 2] ||spm(a*A*x+b*y)||_inf = 8.375120e-01, ||bcsc(a*A*x+b*y)||_inf = 8.375120e-01, ||R||_m = 1.228781e-07 [ 0] ||spm(a*A*x+b*y)||_inf = 8.375120e-01, ||bcsc(a*A*x+b*y)||_inf = 8.375120e-01, ||R||_m = 1.228781e-07 [ 3] ||A||_inf = 1.200000e+01, ||x||_inf = 6.882302e-01, ||y||_inf = 6.931639e-01 [ 3] ||spm(a*A*x+b*y)||_inf = 8.375120e-01, ||bcsc(a*A*x+b*y)||_inf = 8.375120e-01, ||R||_m = 1.228781e-07 [ 1] ||A||_inf = 1.200000e+01, ||x||_inf = 6.882302e-01, ||y||_inf = 6.931639e-01 [ 1] ||spm(a*A*x+b*y)||_inf = 8.375120e-01, ||bcsc(a*A*x+b*y)||_inf = 8.375120e-01, ||R||_m = 1.228781e-07 Case Sequential - 
Complex32 - Hermitian - NoTrans: SUCCESS Case Sequential - Complex32 - Hermitian - Trans: Start 3574: bcsc_mpi_rep_test_bcsc_spmv_tests_lap_c Test #3575: bcsc_mpi_rep_test_bcsc_spmv_tests_lap_z .................................***Timeout 267.50 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.160296e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.883487e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.329900e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: -- BCSC MatVec Test -- Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.461400e+01 s +-------------------------------------------------+ 
Analyze task: Total time for analyze 3.882846e+01 s -- BCSC MatVec Test -- Time to initialize internal csc 9.468216e-01 s -- BCSC MatVec Test -- Case Sequential - Complex64 - Hermitian - NoTrans: -- BCSC MatVec Test -- [ 0] ||A||_inf = 1.200000e+01, ||x||_inf = 6.882302e-01, ||y||_inf = 6.931638e-01 [ 0] ||spm(a*A*x+b*y)||_inf = 8.375120e-01, ||bcsc(a*A*x+b*y)||_inf = 8.375120e-01, ||R||_m = 0.000000e+00 [ 1] ||A||_inf = 1.200000e+01, ||x||_inf = 6.882302e-01, ||y||_inf = 6.931638e-01 [ 2] ||A||_inf = 1.200000e+01, ||x||_inf = 6.882302e-01, ||y||_inf = 6.931638e-01 [ 2] ||spm(a*A*x+b*y)||_inf = 8.375120e-01, ||bcsc(a*A*x+b*y)||_inf = 8.375120e-01, ||R||_m = 0.000000e+00 [ 1] ||spm(a*A*x+b*y)||_inf = 8.375120e-01, ||bcsc(a*A*x+b*y)||_inf = 8.375120e-01, ||R||_m = 0.000000e+00 [ 3] ||A||_inf = 1.200000e+01, ||x||_inf = 6.882302e-01, ||y||_inf = 6.931638e-01 [ 3] ||spm(a*A*x+b*y)||_inf = 8.375120e-01, ||bcsc(a*A*x+b*y)||_inf = 8.375120e-01, ||R||_m = 0.000000e+00 Case Sequential - Complex64 - Hermitian - NoTrans: SUCCESS Case Sequential - Complex64 - Hermitian - Trans: [ 2] ||A||_inf = 1.200000e+01, ||x||_inf = 6.882302e-01, ||y||_inf = 6.931638e-01 [ 1] ||A||_inf = 1.200000e+01, ||x||_inf = 6.882302e-01, ||y||_inf = 6.931638e-01 [ 1] ||spm(a*A*x+b*y)||_inf = 8.375120e-01, ||bcsc(a*A*x+b*y)||_inf = 8.375120e-01, ||R||_m = 0.000000e+00 [ 0] ||A||_inf = 1.200000e+01, ||x||_inf = 6.882302e-01, ||y||_inf = 6.931638e-01 [ 0] ||spm(a*A*x+b*y)||_inf = 8.375120e-01, ||bcsc(a*A*x+b*y)||_inf = 8.375120e-01, ||R||_m = 0.000000e+00 [ 2] ||spm(a*A*x+b*y)||_inf = 8.375120e-01, ||bcsc(a*A*x+b*y)||_inf = 8.375120e-01, ||R||_m = 0.000000e+00 [ 3] ||A||_inf = 1.200000e+01, ||x||_inf = 6.882302e-01, ||y||_inf = 6.931638e-01 [ 3] ||spm(a*A*x+b*y)||_inf = 8.375120e-01, ||bcsc(a*A*x+b*y)||_inf = 8.375120e-01, ||R||_m = 0.000000e+00 Case Sequential - Complex64 - Hermitian - Trans: SUCCESS Case Sequential - Complex64 - Hermitian - ConjTrans: Start 3575: 
bcsc_mpi_rep_test_bcsc_spmv_tests_lap_z Test #3576: bcsc_mpi_rep_test_bcsc_spmv_tests_rsa ...................................***Timeout 266.76 sec RSA driver is no longer supported and is replaced by the HB driver RSA driver is no longer supported and is replaced by the HB driver RSA driver is no longer supported and is replaced by the HB driver RSA driver is no longer supported and is replaced by the HB driver ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3576: bcsc_mpi_rep_test_bcsc_spmv_tests_rsa Test #3577: bcsc_mpi_rep_test_bcsc_spmv_tests_mm ....................................***Timeout 266.27 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled 
Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.010928e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 24350 Fill-in of L 9.878296 Time to compute symbol matrix 1.454164e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.944589e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: -- BCSC MatVec Test -- Number of non-zeroes in blocked L 48700 Fill-in 19.756592 Number of operations in full-rank: LU 6.45 MFlops Prediction: Model AMD 6180 MKL Time to factorize 4.003593e-04 s Time for mapping/scheduling 1.711437e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.875209e+01 s Time to initialize internal csc 7.219921e-01 s -- BCSC MatVec Test -- Case Sequential - Complex64 - Symmetric - NoTrans: -- BCSC MatVec Test -- -- BCSC MatVec Test -- [ 1] ||A||_inf = 4.744600e+02, ||x||_inf = 6.850952e-01, ||y||_inf = 6.931638e-01 [ 1] ||spm(a*A*x+b*y)||_inf = 2.909097e+01, ||bcsc(a*A*x+b*y)||_inf = 2.909097e+01, ||R||_m = 3.552714e-15 [ 0] ||A||_inf = 4.744600e+02, ||x||_inf = 6.850952e-01, ||y||_inf = 6.931638e-01 [ 0] ||spm(a*A*x+b*y)||_inf = 2.909097e+01, ||bcsc(a*A*x+b*y)||_inf = 2.909097e+01, ||R||_m = 3.552714e-15 [ 3] ||A||_inf = 4.744600e+02, ||x||_inf = 6.850952e-01, ||y||_inf = 6.931638e-01 [ 3] ||spm(a*A*x+b*y)||_inf = 2.909097e+01, 
||bcsc(a*A*x+b*y)||_inf = 2.909097e+01, ||R||_m = 3.552714e-15 [ 2] ||A||_inf = 4.744600e+02, ||x||_inf = 6.850952e-01, ||y||_inf = 6.931638e-01 [ 2] ||spm(a*A*x+b*y)||_inf = 2.909097e+01, ||bcsc(a*A*x+b*y)||_inf = 2.909097e+01, ||R||_m = 3.552714e-15 Case Sequential - Complex64 - Symmetric - NoTrans: SUCCESS Case Sequential - Complex64 - Symmetric - Trans: Start 3577: bcsc_mpi_rep_test_bcsc_spmv_tests_mm Test #3579: bcsc_mpi_rep_test_bcsc_spmv_tests_mm2 ...................................***Timeout 265.78 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.475559e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 10749 Fill-in of L 0.893590 Time to compute symbol matrix 1.802389e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.334526e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Start 3579: 
bcsc_mpi_rep_test_bcsc_spmv_tests_mm2 Test #3580: bcsc_mpi_rep_test_bcsc_spmv_time_lap_s ..................................***Timeout 264.49 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Start 3580: bcsc_mpi_rep_test_bcsc_spmv_time_lap_s Test #3581: bcsc_mpi_rep_test_bcsc_spmv_time_lap_d ..................................***Timeout 263.72 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Start 3581: bcsc_mpi_rep_test_bcsc_spmv_time_lap_d Test #3582: bcsc_mpi_rep_test_bcsc_spmv_time_lap_c ..................................***Timeout 263.31 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Start 3582: bcsc_mpi_rep_test_bcsc_spmv_time_lap_c Test #3583: bcsc_mpi_rep_test_bcsc_spmv_time_lap_z ..................................***Timeout 262.21 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Start 3583: bcsc_mpi_rep_test_bcsc_spmv_time_lap_z Test #3584: 
bcsc_mpi_rep_test_bcsc_spmv_time_rsa ....................................***Timeout 261.55 sec RSA driver is no longer supported and is replaced by the HB driver RSA driver is no longer supported and is replaced by the HB driver RSA driver is no longer supported and is replaced by the HB driver RSA driver is no longer supported and is replaced by the HB driver ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 12111 nnz: 40537 Start 3584: bcsc_mpi_rep_test_bcsc_spmv_time_rsa Test #3585: bcsc_mpi_rep_test_bcsc_spmv_time_mm .....................................***Timeout 260.55 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Complex64 Format: IJV N: 841 nnz: 2465 Start 3585: bcsc_mpi_rep_test_bcsc_spmv_time_mm Test #3586: bcsc_mpi_rep_test_bcsc_spmv_time_hb .....................................***Timeout 260.40 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: General Arithmetic: Double Format: CSC N: 1030 nnz: 6858 Start 3586: bcsc_mpi_rep_test_bcsc_spmv_time_hb Test #3587: bcsc_mpi_rep_test_bcsc_spmv_time_mm2 ....................................***Timeout 259.60 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set 
to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: IJV N: 1280 nnz: 12029 Start 3587: bcsc_mpi_rep_test_bcsc_spmv_time_mm2 Test #3589: bcsc_mpi_rep_test_bvec_tests ............................................***Timeout 259.29 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 3589: bcsc_mpi_rep_test_bvec_tests Test #3591: bcsc_mpi_dst_test_bcsc_spmv_tests_lap_s .................................***Timeout 257.73 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.819369e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.421430e+00 s +-------------------------------------------------+ 
Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.248058e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.545534e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.307723e+01 s Time to initialize internal csc 8.139348e+00 s Start 3591: bcsc_mpi_dst_test_bcsc_spmv_tests_lap_s Test #3592: bcsc_mpi_dst_test_bcsc_spmv_tests_lap_d .................................***Timeout 256.60 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.442086e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.991952e-02 s +-------------------------------------------------+ Reordering subtask: 
Split level 0 Stoping criterion -1 Time for reordering 2.291545e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Start 3592: bcsc_mpi_dst_test_bcsc_spmv_tests_lap_d Test #3593: bcsc_mpi_dst_test_bcsc_spmv_tests_lap_c .................................***Timeout 256.17 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.345758e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.506820e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.648059e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Start 3593: bcsc_mpi_dst_test_bcsc_spmv_tests_lap_c Test #3594: bcsc_mpi_dst_test_bcsc_spmv_tests_lap_z .................................***Timeout 256.10 sec ischedInit: The thread number has been automatically set to 256 ischedInit: 
The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.234563e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.081385e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.466285e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Start 3594: bcsc_mpi_dst_test_bcsc_spmv_tests_lap_z Test #3595: bcsc_mpi_dst_test_bcsc_spmv_tests_rsa ...................................***Timeout 255.69 sec RSA driver is no longer supported and is replaced by the HB driver RSA driver is no longer supported and is replaced by the HB driver RSA driver is no longer supported and is replaced by the HB driver RSA driver is no longer supported and is replaced by the HB driver ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3595: bcsc_mpi_dst_test_bcsc_spmv_tests_rsa Test #3596: bcsc_mpi_dst_test_bcsc_spmv_tests_mm ....................................***Timeout 255.67 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.218955e+01 s +-------------------------------------------------+ Symbolic factorization 
subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 24350 Fill-in of L 9.878296 Time to compute symbol matrix 1.359959e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.827788e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 48700 Fill-in 19.756592 Number of operations in full-rank: LU 6.45 MFlops Prediction: Model AMD 6180 MKL Time to factorize 4.003593e-04 s Time for mapping/scheduling 1.209826e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.702626e+01 s Time to initialize internal csc 3.944466e+00 s -- BCSC MatVec Test -- Case Sequential - Complex64 - Symmetric - NoTrans: -- BCSC MatVec Test -- -- BCSC MatVec Test -- -- BCSC MatVec Test -- [ 0] ||A||_inf = 4.744600e+02, ||x||_inf = 1.000000e+00, ||y||_inf = 1.000000e+00 [ 0] ||spm(a*A*x+b*y)||_inf = 2.909097e+01, ||bcsc(a*A*x+b*y)||_inf = 2.909097e+01, ||R||_m = 3.552714e-15 [ 3] ||A||_inf = 4.744600e+02, ||x||_inf = 6.709878e+02, ||y||_inf = 5.179176e+00 [ 3] ||spm(a*A*x+b*y)||_inf = 8.356504e+307, ||bcsc(a*A*x+b*y)||_inf = 2.909097e+01, ||R||_m = 3.552714e-15 [ 2] ||A||_inf = 4.744600e+02, ||x||_inf = 6.709878e+02, ||y||_inf = 5.140403e+00 [ 2] ||spm(a*A*x+b*y)||_inf = 2.792975e+01, ||bcsc(a*A*x+b*y)||_inf = 2.909097e+01, ||R||_m = 3.552714e-15 [ 1] ||A||_inf = 4.744600e+02, ||x||_inf = 6.709878e+02, ||y||_inf = 5.422490e+00 [ 1] ||spm(a*A*x+b*y)||_inf = 1.093140e+02, ||bcsc(a*A*x+b*y)||_inf = 2.909097e+01, ||R||_m = 3.552714e-15 Case Sequential - Complex64 - Symmetric - NoTrans: SUCCESS Case Sequential - Complex64 - Symmetric - Trans: Start 3596: bcsc_mpi_dst_test_bcsc_spmv_tests_mm Test #3598: bcsc_mpi_dst_test_bcsc_spmv_tests_mm2 ...................................***Timeout 256.37 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread 
number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.437616e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 10749 Fill-in of L 0.893590 Time to compute symbol matrix 3.066296e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.636354e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 21498 Fill-in 1.787181 Number of operations in full-rank: LU 1.08 MFlops Prediction: Model AMD 6180 MKL Time to factorize 8.513996e-04 s Time for mapping/scheduling 2.946826e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.301656e+01 s -- BCSC MatVec Test -- -- BCSC MatVec Test -- Time to initialize internal csc 5.674207e+00 s -- BCSC MatVec Test -- Case Sequential - Complex64 - Hermitian - NoTrans: -- BCSC MatVec Test -- Start 3598: bcsc_mpi_dst_test_bcsc_spmv_tests_mm2 Test #3599: bcsc_mpi_dst_test_bcsc_spmv_time_lap_s 
..................................***Timeout 255.97 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Float Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 Start 3599: bcsc_mpi_dst_test_bcsc_spmv_time_lap_s Test #3600: bcsc_mpi_dst_test_bcsc_spmv_time_lap_d ..................................***Timeout 255.53 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 Start 3600: bcsc_mpi_dst_test_bcsc_spmv_time_lap_d Test #3601: bcsc_mpi_dst_test_bcsc_spmv_time_lap_c ..................................***Timeout 255.52 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex32 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 Start 3601: bcsc_mpi_dst_test_bcsc_spmv_time_lap_c Test #3602: bcsc_mpi_dst_test_bcsc_spmv_time_lap_z ..................................***Timeout 255.51 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 
Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 Start 3602: bcsc_mpi_dst_test_bcsc_spmv_time_lap_z Test #3603: bcsc_mpi_dst_test_bcsc_spmv_time_rsa ....................................***Timeout 255.51 sec RSA driver is no longer supported and is replaced by the HB driver RSA driver is no longer supported and is replaced by the HB driver RSA driver is no longer supported and is replaced by the HB driver RSA driver is no longer supported and is replaced by the HB driver ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 12111 nnz: 40537 Details: N nnz 0: 3028 20312 1: 3028 15614 2: 3028 4611 3: 3027 0 Start 3603: bcsc_mpi_dst_test_bcsc_spmv_time_rsa Test #3604: bcsc_mpi_dst_test_bcsc_spmv_time_mm .....................................***Timeout 255.20 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Complex64 Format: IJV N: 841 nnz: 2465 Details: N nnz 0: 211 626 1: 210 623 2: 210 623 3: 210 593 Start 3604: bcsc_mpi_dst_test_bcsc_spmv_time_mm Test #3605: bcsc_mpi_dst_test_bcsc_spmv_time_hb .....................................***Timeout 255.16 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: General Arithmetic: Double Format: CSC N: 1030 nnz: 6858 Details: N nnz 0: 258 1740 1: 258 1636 2: 257 1862 3: 257 
1620 Start 3605: bcsc_mpi_dst_test_bcsc_spmv_time_hb Test #3606: bcsc_mpi_dst_test_bcsc_spmv_time_mm2 ....................................***Timeout 255.15 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: IJV N: 1280 nnz: 12029 Details: N nnz 0: 320 2957 1: 320 3080 2: 320 3080 3: 320 2912 Start 3606: bcsc_mpi_dst_test_bcsc_spmv_time_mm2 Test #3607: bcsc_mpi_dst_test_bvec_tests ............................................***Timeout 255.57 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 3607: bcsc_mpi_dst_test_bvec_tests Test #3608: bcsc_mpi_dst_test_bvec_applyorder_tests .................................***Timeout 256.04 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Check b == (P^t (P b)) / Case Single DoF - Distributed - Float - 1 nrhs: SUCCESS Check b == (P^t (P b)) / Case Single DoF - Distributed - Double - 1 nrhs: SUCCESS Check b == (P^t (P b)) / Case Single DoF - Distributed - Complex32 - 1 nrhs: SUCCESS Check b == (P^t (P b)) / Case Single DoF - Distributed - Complex64 - 1 nrhs: SUCCESS Check b == (P^t (P b)) / Case Single DoF - Distributed - Float - 5 nrhs: SUCCESS Check b == (P^t (P b)) / Case Single DoF - Distributed - Double - 5 nrhs: SUCCESS Check b == (P^t (P b)) / Case Single DoF - Distributed - Complex32 - 5 nrhs: SUCCESS Check b == (P^t (P b)) / Case Single DoF - 
Distributed - Complex64 - 5 nrhs: SUCCESS Start 3608: bcsc_mpi_dst_test_bvec_applyorder_tests Test #3610: fortran_mpi_fsimple .....................................................***Timeout 255.65 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.477748e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.778071e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.989154e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 9.742613e+00 s +-------------------------------------------------+ 
Analyze task: Total time for analyze 3.909236e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.527290e-03 s Time to initialize coeftab 1.248989e+00 s Start 3610: fortran_mpi_fsimple Test #3612: fortran_mpi_flaplacian ..................................................***Timeout 255.65 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.938828e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.900351e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.671549e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: 
LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.290856e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.569340e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.830571e-02 s Time to initialize coeftab 1.875922e+00 s Start 3612: fortran_mpi_flaplacian Test #3614: fortran_mpi_fstep-by-step ...............................................***Timeout 255.65 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3614: fortran_mpi_fstep-by-step Test #3616: fortran_mpi_fmultidof ...................................................***Timeout 254.77 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Dof: 5 N expanded: 5000 NNZ expanded: 92500 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.288070e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 1410450 Fill-in of L 381.202703 Time to compute symbol matrix 3.410682e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.161175e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Start 3616: fortran_mpi_fmultidof Test #3618: fortran_mpi_fusermat_csr ................................................***Timeout 253.52 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 
Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 64 nnz: 208 Details: N nnz 0: 16 56 1: 16 56 2: 16 56 3: 16 40 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 8.650001e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 1198 Fill-in of L 5.759615 Time to compute symbol matrix 8.760157e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.296447e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 2396 Fill-in 11.519231 Number of operations in full-rank: LU 55.06 KFlops Prediction: Model AMD 6180 MKL Time to factorize 1.354808e-04 s Time for mapping/scheduling 1.129017e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.352740e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.789290e+00 s Time to initialize coeftab 7.755650e-01 s Start 3618: fortran_mpi_fusermat_csr Test #3619: fortran_shm_fmultilap_seq ...............................................***Timeout 252.52 sec !--------------------------------------------------------------------! ! Multiple Laplacian testing configuration ! !--------------------------------------------------------------------! 
Nb of threads = 5 Nb of PaStiX instances = 1 Nb of outer iterations = 2 Nb of distinct matrices = 2 Nb of RHS to solve per matrix = 10 Nb of solve phase to perform per RHS = 2 Size of each matrix = 1000 ( 10 x 10 x 10 ) Nbr of non zero entries per matrix = 3700 The multirhs mode is disabled !--------------------------------------------------------------------! Size of x = 1000 Matrix NRHS Start End 1 2 1 2 1 2 3 4 1 2 5 6 1 2 7 8 1 2 9 10 2 2 1 2 2 2 3 4 2 2 5 6 2 2 7 8 2 2 9 10 !====================================================================! Outer iteration: 1 !--------------------------------------------------------------------! Nb of factorization performed in parallel = 1 Nb of threads used by PaStiX per factorization = 5 Start 3619: fortran_shm_fmultilap_seq Test #3620: fortran_shm_fmultilap_mt ................................................***Timeout 252.11 sec !--------------------------------------------------------------------! ! Multiple Laplacian testing configuration ! !--------------------------------------------------------------------! Nb of threads = 10 Nb of PaStiX instances = 2 Nb of outer iterations = 2 Nb of distinct matrices = 4 Nb of RHS to solve per matrix = 6 Nb of solve phase to perform per RHS = 3 Size of each matrix = 1000 ( 10 x 10 x 10 ) Nbr of non zero entries per matrix = 3700 The multirhs mode is enabled !--------------------------------------------------------------------! Size of x = 1000 Matrix NRHS Start End 1 6 1 6 2 6 1 6 3 6 1 6 4 6 1 6 !====================================================================! Outer iteration: 1 !--------------------------------------------------------------------! Nb of factorization performed in parallel = 2 Nb of threads used by PaStiX per factorization = 5 !--------------------------------------------------------------------! ! Results per matrix ! 
Matrix 1 done by instance 1 Time for analysis 5.8871461610000004 Pred Time for fact 8.4819232934860015E-004 Time for factorization 16.707931027000001 GFlops/s for fact 1.1039883710189822E-012 Matrix 2 done by instance 1 Time for analysis 11.284121769000002 Pred Time for fact 8.4819232934860015E-004 Time for factorization 32.375658553999997 GFlops/s for fact 5.6972930841946770E-013 Matrix 3 done by instance 2 Time for analysis 5.9392496189999981 Pred Time for fact 8.4819232934860015E-004 Time for factorization 16.042131826000002 GFlops/s for fact 1.1498073795716007E-012 Matrix 4 done by instance 2 Time for analysis 10.599911290999998 Pred Time for fact 8.4819232934860015E-004 Time for factorization 27.489519645000001 GFlops/s for fact 6.7099613946692671E-013 !--------------------------------------------------------------------! ! Results per PaStiX instance ! Thread 1 Time for analysis 17.171267930000003 Time for factorization 49.083589580999998 GFlops/s for fact 8.3685883971922488E-013 Thread 2 Time for analysis 16.539160909999996 Time for factorization 43.531651471000004 GFlops/s for fact 9.1040175951926359E-013 !--------------------------------------------------------------------! Outer iterate 1 Time for analysis 17.171267930000003 Time for factorization 49.083589580999998 GFlops/s for fact 8.7363029961924424E-013 !--------------------------------------------------------------------! Nb of OpenMP threads enrolled for solution = 2 Nb of pthreads used in PaStiX for solution = 5 !--------------------------------------------------------------------! Solve iteration nr. 1 WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one !--------------------------------------------------------------------! 
Check results for system 1 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 || A ||_1 1.350000e+01 max(|| b_i ||_oo) 8.458005e+00 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.401375e-16 max(|| b_i - A x_i ||_1) 2.382554e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.601299e+00 (SUCCESS) !--------------------------------------------------------------------! Check results for system 2 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 || A ||_1 2.700000e+01 max(|| b_i ||_oo) 1.691601e+01 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.314068e-16 max(|| b_i - A x_i ||_1) 2.434351e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.306355e+00 (SUCCESS) !--------------------------------------------------------------------! Check results for system 3 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 || A ||_1 4.050000e+01 max(|| b_i ||_oo) 2.537402e+01 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.411908e-16 max(|| b_i - A x_i ||_1) 2.174164e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 4.444348e+00 (SUCCESS) !--------------------------------------------------------------------! Check results for system 4 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 || A ||_1 5.400000e+01 max(|| b_i ||_oo) 3.383202e+01 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.402552e-16 max(|| b_i - A x_i ||_1) 2.377240e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.390910e+00 (SUCCESS) !--------------------------------------------------------------------! Thread 1 Time for solution 20.826966326000004 Thread 2 Time for solution 24.432067591000006 !--------------------------------------------------------------------! Solve iteration nr. 
2 Start 3620: fortran_shm_fmultilap_mt Test #3622: python_mpi_simple .......................................................***Timeout 251.07 sec Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Start 3622: python_mpi_simple Test #3624: python_mpi_step-by-step .................................................***Timeout 250.28 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 
8.0 Low rank parameters: Strategy No compression ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Symmetric Arithmetic: Double Format: CSC N: 125 nnz: 425 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.240333e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 3474 Fill-in of L 8.174118 Time to compute symbol matrix 1.418995e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.876120e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 6948 Fill-in 16.348235 Number of operations in full-rank: LU 225.24 KFlops Prediction: Model AMD 6180 MKL Time to factorize 1.668945e-04 s Time for mapping/scheduling 1.345415e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.742516e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 8.228165e-03 s Time to initialize coeftab 4.172539e-01 s Start 3624: python_mpi_step-by-step Test #3626: python_mpi_simple_obj ...................................................***Timeout 249.09 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: 
PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: General Arithmetic: Double Format: CSC N: 1000 nnz: 4992 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Start 3626: python_mpi_simple_obj Test #3613: fortran_shm_fstep-by-step ...............................................***Timeout 248.38 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.173615e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.664305e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.846985e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: 
Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 5.888781e+00 s 8 11 14 4 893 278 283 136 145 139 WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Start 3613: fortran_shm_fstep-by-step Test #3615: fortran_shm_fmultidof ...................................................***Timeout 248.37 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Dof: 5 N expanded: 5000 NNZ expanded: 92500 Start 3615: fortran_shm_fmultidof Test #3623: python_shm_step-by-step .................................................***Timeout 247.48 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: 
Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 125 nnz: 425 WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 6.614860e+00 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 3474 Fill-in of L 8.174118 Time to compute symbol matrix 1.309546e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.185584e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 6948 Fill-in 16.348235 Number of operations in full-rank: LU 225.24 KFlops Prediction: Model AMD 6180 MKL Time to factorize 1.798249e-04 s Time for mapping/scheduling 9.191475e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 7.865045e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 5.314595e-02 s Time to initialize coeftab 8.595868e-02 s Time to factorize 2.222163e+00 s (101.36 KFlop/s) Number of operations 174.12 KFlops Number of static pivots 0 Memory usage of coeftab 74 Ko Time to solve 5.548424e+00 s Time for refinement 1.375025e+01 s || A ||_1 1.367598e-01 max(|| b_i ||_oo) 5.796323e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.349022e-16 max(|| b_i - A x_i ||_1) 1.535757e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.853865e+00 (SUCCESS) WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Time to solve 2.379227e+00 s Time for refinement 1.206445e+01 s || A ||_1 1.367598e-01 max(|| b_i ||_oo) 5.796323e-02 max(|| x_i ||_oo) 
4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.258991e-16 max(|| b_i - A x_i ||_1) 1.512863e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.791453e+00 (SUCCESS) WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.169880e-04 s Time to initialize coeftab 1.174618e-01 s Time to factorize 8.634127e-01 s (260.87 KFlop/s) Number of operations 174.12 KFlops Number of static pivots 0 Memory usage of coeftab 74 Ko Time to solve 1.578388e+00 s Time for refinement 1.052625e+01 s || A ||_1 1.367598e-01 max(|| b_i ||_oo) 5.796323e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.259827e-16 max(|| b_i - A x_i ||_1) 1.515138e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.816887e+00 (SUCCESS) WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one Time to solve 1.751150e+00 s Time for refinement 1.136076e+01 s || A ||_1 1.367598e-01 max(|| b_i ||_oo) 5.796323e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.355357e-16 max(|| b_i - A x_i ||_1) 1.550680e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.817713e+00 (SUCCESS) Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 Start 3623: python_shm_step-by-step Test #3442: mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_rqrcpend .....***Timeout 203.14 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started 
thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.299578e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.366323e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 5.420624e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.044154e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.927994e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 5.406638e+00 s Time to initialize coeftab 4.631181e-01 s Time to factorize 6.864364e+01 s 
(317.86 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 7.729999e+00 s - iteration 1 : total iteration time 7.43 s error 6.8943e-18 Time for refinement 1.302465e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.285477e-16 max(|| b_i - A x_i ||_1) 6.451046e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.627818e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.285477e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.285477e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 1.285477e-16 max(|| b_i - A x_i ||_1) 6.451046e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.627818e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 6.451046e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.627818e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 6.451046e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.627818e-03 (SUCCESS) Test #3448: mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_tqrcpend .....***Timeout 200.51 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.253897e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.811539e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.012543e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.334637e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.772844e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t 
Time to initialize internal csc 5.673542e+00 s Time to initialize coeftab 1.016308e+00 s Time to factorize 4.814994e+01 s (453.15 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 7.260031e+00 s - iteration 1 : total iteration time 12.1 s error 2.6248e-16 Time for refinement 2.039209e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.849886e-16 max(|| b_i - A x_i ||_1) 8.095970e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.042888e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.849886e-16 max(|| b_i - A x_i ||_1) 8.095970e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.042888e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.849886e-16 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.849886e-16 max(|| b_i - A x_i ||_1) 8.095970e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.042888e-03 (SUCCESS) max(|| b_i - A x_i ||_1) 8.095970e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.042888e-03 (SUCCESS) Test #3554: bcsc_shm_test_bcsc_spmv_tests_lap_d .....................................***Timeout 178.38 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: 
Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.015753e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.594075e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.785202e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 4.820749e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.165557e+01 s Time to initialize internal csc 6.812134e-01 s -- BCSC MatVec Test -- Case Sequential - Double - Symmetric - NoTrans: [ 0] ||A||_inf = 1.200000e+01, ||x||_inf = 5.000000e-01, ||y||_inf = 5.000000e-01 [ 0] ||spm(a*A*x+b*y)||_inf = 1.468726e+00, ||bcsc(a*A*x+b*y)||_inf = 1.468726e+00, ||R||_m = 0.000000e+00 Case Sequential - Double - Symmetric - NoTrans: SUCCESS Case Sequential - Double - Symmetric - Trans: [ 0] ||A||_inf = 1.200000e+01, ||x||_inf = 5.000000e-01, ||y||_inf = 5.000000e-01 [ 0] ||spm(a*A*x+b*y)||_inf = 1.468726e+00, ||bcsc(a*A*x+b*y)||_inf = 1.468726e+00, ||R||_m = 0.000000e+00 Case Sequential - Double - Symmetric - Trans: SUCCESS Case Static - 
Double - Symmetric - NoTrans: [ 0] ||A||_inf = 1.200000e+01, ||x||_inf = 5.000000e-01, ||y||_inf = 5.000000e-01 [ 0] ||spm(a*A*x+b*y)||_inf = 1.468726e+00, ||bcsc(a*A*x+b*y)||_inf = 1.468726e+00, ||R||_m = 0.000000e+00 Case Static - Double - Symmetric - NoTrans: SUCCESS Case Static - Double - Symmetric - Trans: [ 0] ||A||_inf = 1.200000e+01, ||x||_inf = 5.000000e-01, ||y||_inf = 5.000000e-01 [ 0] ||spm(a*A*x+b*y)||_inf = 1.468726e+00, ||bcsc(a*A*x+b*y)||_inf = 1.468726e+00, ||R||_m = 0.000000e+00 Case Static - Double - Symmetric - Trans: SUCCESS -- All tests PASSED -- Test #3557: bcsc_shm_test_bcsc_spmv_tests_rsa .......................................***Timeout 177.66 sec RSA driver is no longer supported and is replaced by the HB driver ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 7.964561e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 1607873 Fill-in of L 39.664331 Time to compute symbol matrix 7.795920e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.381041e+01 s +-------------------------------------------------+ 
Mapping/Scheduling subtask: Number of non-zeroes in blocked L 3215746 Fill-in 79.328663 Number of operations in full-rank: LU 1.24 GFlops Prediction: Model AMD 6180 MKL Time to factorize 1.279898e-01 s Time for mapping/scheduling 4.675197e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.125148e+02 s Time to initialize internal csc 1.964479e-02 s -- BCSC MatVec Test -- Case Sequential - Double - Symmetric - NoTrans: [ 0] ||A||_inf = 7.580208e+12, ||x||_inf = 5.000000e-01, ||y||_inf = 5.000000e-01 [ 0] ||spm(a*A*x+b*y)||_inf = 7.901221e+11, ||bcsc(a*A*x+b*y)||_inf = 7.901221e+11, ||R||_m = 1.220703e-04 Case Sequential - Double - Symmetric - NoTrans: SUCCESS Case Sequential - Double - Symmetric - Trans: [ 0] ||A||_inf = 7.580208e+12, ||x||_inf = 5.000000e-01, ||y||_inf = 5.000000e-01 [ 0] ||spm(a*A*x+b*y)||_inf = 7.901221e+11, ||bcsc(a*A*x+b*y)||_inf = 7.901221e+11, ||R||_m = 1.220703e-04 Case Sequential - Double - Symmetric - Trans: SUCCESS Case Static - Double - Symmetric - NoTrans: [ 0] ||A||_inf = 7.580208e+12, ||x||_inf = 5.000000e-01, ||y||_inf = 5.000000e-01 [ 0] ||spm(a*A*x+b*y)||_inf = 7.901221e+11, ||bcsc(a*A*x+b*y)||_inf = 7.901221e+11, ||R||_m = 1.220703e-04 Case Static - Double - Symmetric - NoTrans: SUCCESS Case Static - Double - Symmetric - Trans: [ 0] ||A||_inf = 7.580208e+12, ||x||_inf = 5.000000e-01, ||y||_inf = 5.000000e-01 [ 0] ||spm(a*A*x+b*y)||_inf = 7.901221e+11, ||bcsc(a*A*x+b*y)||_inf = 7.901221e+11, ||R||_m = 1.220703e-04 Case Static - Double - Symmetric - Trans: SUCCESS -- All tests PASSED -- Test #3558: bcsc_shm_test_bcsc_spmv_tests_mm ........................................***Timeout 177.65 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: 
Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.342111e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 24350 Fill-in of L 9.878296 Time to compute symbol matrix 7.061339e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.206053e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 48700 Fill-in 19.756592 Number of operations in full-rank: LU 6.45 MFlops Prediction: Model AMD 6180 MKL Time to factorize 3.848488e-04 s Time for mapping/scheduling 3.924532e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 1.454871e+01 s Time to initialize internal csc 4.547729e-03 s -- BCSC MatVec Test -- Case Sequential - Complex64 - Symmetric - NoTrans: [ 0] ||A||_inf = 4.744600e+02, ||x||_inf = 6.850952e-01, ||y||_inf = 6.931638e-01 [ 0] ||spm(a*A*x+b*y)||_inf = 2.909097e+01, ||bcsc(a*A*x+b*y)||_inf = 2.909097e+01, ||R||_m = 3.552714e-15 Case Sequential - Complex64 - Symmetric - NoTrans: SUCCESS Case Sequential - Complex64 - Symmetric - Trans: [ 0] ||A||_inf = 4.744600e+02, ||x||_inf = 6.850952e-01, ||y||_inf = 6.931638e-01 [ 0] ||spm(a*A*x+b*y)||_inf = 2.909097e+01, ||bcsc(a*A*x+b*y)||_inf = 2.909097e+01, ||R||_m = 3.552714e-15 Case Sequential - Complex64 - Symmetric - Trans: SUCCESS Case Sequential - Complex64 - Symmetric - ConjTrans: 
[ 0] ||A||_inf = 4.744600e+02, ||x||_inf = 6.850952e-01, ||y||_inf = 6.931638e-01 [ 0] ||spm(a*A*x+b*y)||_inf = 2.909097e+01, ||bcsc(a*A*x+b*y)||_inf = 2.909097e+01, ||R||_m = 3.552714e-15 Case Sequential - Complex64 - Symmetric - ConjTrans: SUCCESS Case Static - Complex64 - Symmetric - NoTrans: [ 0] ||A||_inf = 4.744600e+02, ||x||_inf = 6.850952e-01, ||y||_inf = 6.931638e-01 [ 0] ||spm(a*A*x+b*y)||_inf = 2.909097e+01, ||bcsc(a*A*x+b*y)||_inf = 2.909097e+01, ||R||_m = 3.552714e-15 Case Static - Complex64 - Symmetric - NoTrans: SUCCESS Case Static - Complex64 - Symmetric - Trans: [ 0] ||A||_inf = 4.744600e+02, ||x||_inf = 6.850952e-01, ||y||_inf = 6.931638e-01 [ 0] ||spm(a*A*x+b*y)||_inf = 2.909097e+01, ||bcsc(a*A*x+b*y)||_inf = 2.909097e+01, ||R||_m = 3.552714e-15 Case Static - Complex64 - Symmetric - Trans: SUCCESS Case Static - Complex64 - Symmetric - ConjTrans: [ 0] ||A||_inf = 4.744600e+02, ||x||_inf = 6.850952e-01, ||y||_inf = 6.931638e-01 [ 0] ||spm(a*A*x+b*y)||_inf = 2.909097e+01, ||bcsc(a*A*x+b*y)||_inf = 2.909097e+01, ||R||_m = 3.552714e-15 Case Static - Complex64 - Symmetric - ConjTrans: SUCCESS -- All tests PASSED -- Test #3559: bcsc_shm_test_bcsc_spmv_tests_hb ........................................***Timeout 175.94 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression +-------------------------------------------------+ Ordering subtask : 
Ordering method is: Scotch Time to compute ordering 2.359080e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 51109 Fill-in of L 7.452464 Time to compute symbol matrix 7.505576e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.671913e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 102218 Fill-in 14.904929 Number of operations in full-rank: LU 5.50 MFlops Prediction: Model AMD 6180 MKL Time to factorize 7.121319e-04 s Time for mapping/scheduling 3.597565e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.423091e+01 s Time to initialize internal csc 6.173834e-03 s -- BCSC MatVec Test -- Case Sequential - Double - General - NoTrans: [ 0] ||A||_inf = 5.350392e+05, ||x||_inf = 5.000000e-01, ||y||_inf = 5.000000e-01 [ 0] ||spm(a*A*x+b*y)||_inf = 5.249351e+04, ||bcsc(a*A*x+b*y)||_inf = 5.249351e+04, ||R||_m = 7.275958e-12 Case Sequential - Double - General - NoTrans: SUCCESS Case Sequential - Double - General - Trans: [ 0] ||A||_inf = 5.682954e+05, ||x||_inf = 5.000000e-01, ||y||_inf = 5.000000e-01 [ 0] ||spm(a*A*x+b*y)||_inf = 5.045035e+04, ||bcsc(a*A*x+b*y)||_inf = 5.045035e+04, ||R||_m = 7.275958e-12 Case Sequential - Double - General - Trans: SUCCESS Case Static - Double - General - NoTrans: [ 0] ||A||_inf = 5.350392e+05, ||x||_inf = 5.000000e-01, ||y||_inf = 5.000000e-01 [ 0] ||spm(a*A*x+b*y)||_inf = 5.249351e+04, ||bcsc(a*A*x+b*y)||_inf = 5.249351e+04, ||R||_m = 7.275958e-12 Case Static - Double - General - NoTrans: SUCCESS Case Static - Double - General - Trans: [ 0] ||A||_inf = 5.682954e+05, ||x||_inf = 5.000000e-01, ||y||_inf = 5.000000e-01 [ 0] ||spm(a*A*x+b*y)||_inf = 5.045035e+04, ||bcsc(a*A*x+b*y)||_inf = 5.045035e+04, ||R||_m = 7.275958e-12 Case 
Static - Double - General - Trans: SUCCESS
-- All tests PASSED --
Test #3561: bcsc_shm_test_bcsc_spmv_time_lap_s ......................................***Timeout 174.61 sec
ischedInit: The thread number has been automatically set to 256
Matrix type: Symmetric
Arithmetic: Float
Format: CSC
N: 1000
nnz: 3700
Time for zspmv ( n=1000; nnz=3700 ) : 2.350883e-01 s ( 31 KFlop/s)
Test #3562: bcsc_shm_test_bcsc_spmv_time_lap_d ......................................***Timeout 173.87 sec
ischedInit: The thread number has been automatically set to 256
Matrix type: Symmetric
Arithmetic: Double
Format: CSC
N: 1000
nnz: 3700
Time for zspmv ( n=1000; nnz=3700 ) : 2.211479e-01 s ( 33 KFlop/s)
Test #3563: bcsc_shm_test_bcsc_spmv_time_lap_c ......................................***Timeout 172.40 sec
ischedInit: The thread number has been automatically set to 256
Matrix type: Hermitian
Arithmetic: Complex32
Format: CSC
N: 1000
nnz: 3700
Time for zspmv ( n=1000; nnz=3700 ) : 2.501285e-01 s ( 1.2e+02 KFlop/s)
Test #3564: bcsc_shm_test_bcsc_spmv_time_lap_z ......................................***Timeout 172.36 sec
ischedInit: The thread number has been automatically set to 256
Matrix type: Hermitian
Arithmetic: Complex64
Format: CSC
N: 1000
nnz: 3700
Time for zspmv ( n=1000; nnz=3700 ) : 2.264009e-01 s ( 1.3e+02 KFlop/s)
Test #3566: bcsc_shm_test_bcsc_spmv_time_mm .........................................***Timeout 171.55 sec
ischedInit: The thread number has been automatically set to 256
Matrix type: Symmetric
Arithmetic: Complex64
Format: IJV
N: 841
nnz: 2465
Time for zspmv ( n=841; nnz=2465 ) : 2.379686e-01 s ( 81 KFlop/s)
Test #3567: bcsc_shm_test_bcsc_spmv_time_hb .........................................***Timeout 169.05 sec
ischedInit: The thread number has been automatically set to 256
Matrix type: General
Arithmetic: Double
Format: CSC
N: 1030
nnz: 6858
Time for zspmv ( n=1030; nnz=6858 ) : 1.756852e-01 s ( 76 KFlop/s)
Test #3569: bcsc_shm_test_bvec_gemv_tests ...........................................***Timeout 168.26 sec
ischedInit: The thread number has been automatically set to 256
Case Float - Sequential: SUCCESS
Case Double - Sequential: SUCCESS
Case Complex32 - Sequential: SUCCESS
Case Complex64 - Sequential: SUCCESS
Case Float - Static: SUCCESS
Case Double - Static: SUCCESS
Case Complex32 - Static: SUCCESS
Case Complex64 - Static: SUCCESS
-- All tests PASSED --
Test #3590: bcsc_mpi_rep_test_bvec_applyorder_tests .................................***Timeout 167.57 sec
ischedInit: The thread number has been automatically set to 256
ischedInit: The thread number has been automatically set to 256
ischedInit: The thread number has been automatically set to 256
ischedInit: The thread number has been automatically set to 256
Check b == (P^t (P b)) / Case Single DoF - Replicated - Float - 1 nrhs: SUCCESS
Check b == (P^t (P b)) / Case Single DoF - Replicated - Double - 1 nrhs: SUCCESS
Check b == (P^t (P b)) / Case Single DoF - Replicated - Complex32 - 1 nrhs: SUCCESS
Check b == (P^t (P b)) / Case Single DoF - Replicated - Complex64 - 1 nrhs: SUCCESS
Check b == (P^t (P b)) / Case Single DoF - Replicated - Float - 5 nrhs: SUCCESS
Check b == (P^t (P b)) / Case Single DoF - Replicated - Double - 5 nrhs: SUCCESS
Check b == (P^t (P b)) / Case Single DoF - Replicated - Complex32 - 5 nrhs: SUCCESS
Check b == (P^t (P b)) / Case Single DoF - Replicated - Complex64 - 5 nrhs: SUCCESS
Check b == (P^t (P b)) / Case Constant Multi DoF - Replicated - Float - 1 nrhs: SUCCESS
Check b == (P^t (P b)) / Case Constant Multi DoF - Replicated - Double - 1 nrhs: SUCCESS
Check b == (P^t (P b)) / Case Constant Multi DoF - Replicated - Complex32 - 1 nrhs: SUCCESS
Check b == (P^t (P b)) / Case Constant Multi DoF - Replicated - Complex64 - 1 nrhs: SUCCESS
Check b == (P^t (P b)) / Case Constant Multi DoF - Replicated - Float - 5 nrhs: SUCCESS
Check b == (P^t (P b)) / Case Constant Multi DoF - Replicated - Double - 5 nrhs: SUCCESS
Check b == (P^t (P b))
/ Case Constant Multi DoF - Replicated - Complex32 - 5 nrhs: SUCCESS Check b == (P^t (P b)) / Case Constant Multi DoF - Replicated - Complex64 - 5 nrhs: SUCCESS -- All tests PASSED -- Test #3597: bcsc_mpi_dst_test_bcsc_spmv_tests_hb ....................................***Timeout 167.56 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.685867e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 51109 Fill-in of L 7.452464 Time to compute symbol matrix 1.002409e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.972753e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 102218 Fill-in 14.904929 Number of operations in full-rank: LU 5.50 MFlops Prediction: Model AMD 6180 MKL Time to factorize 7.121319e-04 s Time for mapping/scheduling 5.615268e+00 s 
+-------------------------------------------------+ Analyze task: Total time for analyze 3.384466e+01 s -- BCSC MatVec Test -- Time to initialize internal csc 9.854649e-03 s -- BCSC MatVec Test -- Case Sequential - Double - General - NoTrans: -- BCSC MatVec Test -- -- BCSC MatVec Test -- [ 0] ||A||_inf = 5.350392e+05, ||x||_inf = -5.000000e+00, ||y||_inf = 8.543869e+270 [ 0] ||spm(a*A*x+b*y)||_inf = 8.543869e+270, ||bcsc(a*A*x+b*y)||_inf = 5.249351e+04, ||R||_m = 7.275958e-12 [ 3] ||A||_inf = 5.350392e+05, ||x||_inf = -5.000000e+00, ||y||_inf = -5.000000e+00 [ 2] ||A||_inf = 5.350392e+05, ||x||_inf = 6.666667e+04, ||y||_inf = -5.000000e+00 [ 1] ||A||_inf = 5.350392e+05, ||x||_inf = 4.984399e-01, ||y||_inf = -5.000000e+00 [ 2] ||spm(a*A*x+b*y)||_inf = -5.000000e+00, ||bcsc(a*A*x+b*y)||_inf = 5.249351e+04, ||R||_m = 7.275958e-12 [ 3] ||spm(a*A*x+b*y)||_inf = -5.000000e+00, ||bcsc(a*A*x+b*y)||_inf = 5.249351e+04, ||R||_m = 7.275958e-12 [ 1] ||spm(a*A*x+b*y)||_inf = -5.000000e+00, ||bcsc(a*A*x+b*y)||_inf = 5.249351e+04, ||R||_m = 7.275958e-12 Case Sequential - Double - General - NoTrans: SUCCESS Case Sequential - Double - General - Trans: [ 1] ||A||_inf = 5.682954e+05, ||x||_inf = -5.000000e+00, ||y||_inf = -5.000000e+00 [ 3] ||A||_inf = 5.682954e+05, ||x||_inf = -5.000000e+00, ||y||_inf = -5.000000e+00 [ 3] ||spm(a*A*x+b*y)||_inf = -5.000000e+00, ||bcsc(a*A*x+b*y)||_inf = 5.045035e+04, ||R||_m = 7.275958e-12 [ 0] ||A||_inf = 5.682954e+05, ||x||_inf = -5.000000e+00, ||y||_inf = -5.000000e+00 [ 0] ||spm(a*A*x+b*y)||_inf = -5.000000e+00, ||bcsc(a*A*x+b*y)||_inf = 5.045035e+04, ||R||_m = 7.275958e-12 Case Sequential - Double - General - Trans: SUCCESS Case Static - Double - General - NoTrans: [ 2] ||A||_inf = 5.682954e+05, ||x||_inf = -5.000000e+00, ||y||_inf = -5.000000e+00 [ 2] ||spm(a*A*x+b*y)||_inf = -5.000000e+00, ||bcsc(a*A*x+b*y)||_inf = 5.045035e+04, ||R||_m = 7.275958e-12 [ 1] ||spm(a*A*x+b*y)||_inf = -5.000000e+00, ||bcsc(a*A*x+b*y)||_inf = 5.045035e+04, ||R||_m 
= 7.275958e-12 [ 3] ||A||_inf = 5.350392e+05, ||x||_inf = -5.000000e+00, ||y||_inf = -5.000000e+00 [ 3] ||spm(a*A*x+b*y)||_inf = -5.000000e+00, ||bcsc(a*A*x+b*y)||_inf = 5.249351e+04, ||R||_m = 7.275958e-12 [ 2] ||A||_inf = 5.350392e+05, ||x||_inf = -5.000000e+00, ||y||_inf = -5.000000e+00 [ 2] ||spm(a*A*x+b*y)||_inf = -5.000000e+00, ||bcsc(a*A*x+b*y)||_inf = 5.249351e+04, ||R||_m = 7.275958e-12 [ 1] ||A||_inf = 5.350392e+05, ||x||_inf = -5.000000e+00, ||y||_inf = -5.000000e+00 [ 1] ||spm(a*A*x+b*y)||_inf = -5.000000e+00, ||bcsc(a*A*x+b*y)||_inf = 5.249351e+04, ||R||_m = 7.275958e-12 [ 0] ||A||_inf = 5.350392e+05, ||x||_inf = 6.450476e+245, ||y||_inf = -5.000000e+00 [ 0] ||spm(a*A*x+b*y)||_inf = -5.000000e+00, ||bcsc(a*A*x+b*y)||_inf = 5.249351e+04, ||R||_m = 7.275958e-12 Case Static - Double - General - NoTrans: SUCCESS Case Static - Double - General - Trans: [ 1] ||A||_inf = 5.682954e+05, ||x||_inf = -5.000000e+00, ||y||_inf = -5.000000e+00 [ 1] ||spm(a*A*x+b*y)||_inf = -5.000000e+00, ||bcsc(a*A*x+b*y)||_inf = 5.045035e+04, ||R||_m = 7.275958e-12 [ 0] ||A||_inf = 5.682954e+05, ||x||_inf = 6.450476e+245, ||y||_inf = -5.000000e+00 [ 0] ||spm(a*A*x+b*y)||_inf = -5.000000e+00, ||bcsc(a*A*x+b*y)||_inf = 5.045035e+04, ||R||_m = 7.275958e-12 [ 2] ||A||_inf = 5.682954e+05, ||x||_inf = -5.000000e+00, ||y||_inf = 8.543869e+270 [ 2] ||spm(a*A*x+b*y)||_inf = 8.543869e+270, ||bcsc(a*A*x+b*y)||_inf = 5.045035e+04, ||R||_m = 7.275958e-12 [ 3] ||A||_inf = 5.682954e+05, ||x||_inf = 2.666667e+05, ||y||_inf = 2.666667e+05 [ 3] ||spm(a*A*x+b*y)||_inf = 1.127386e+251, ||bcsc(a*A*x+b*y)||_inf = 5.045035e+04, ||R||_m = 7.275958e-12 Case Static - Double - General - Trans: SUCCESS -- All tests PASSED -- Test #3609: fortran_shm_fsimple .....................................................***Timeout 167.49 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 1000 nnz: 3700 WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.947579e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.538417e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.955656e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 9.98 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 6.191593e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.233456e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 1.059822e+00 s Time to initialize coeftab 1.381391e+00 s Time to factorize 1.840552e+01 s (555.50 KFlop/s) Number of operations 2.04 MFlops Number of static pivots 0 Memory usage of coeftab 1.25 Mo Time to solve 6.673779e+00 s Time for refinement 1.209467e+01 s || A ||_1 5.112481e-02 max(|| b_i 
||_oo) 2.176416e-02 max(|| x_i ||_oo) 4.996108e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.882099e-16 max(|| b_i - A x_i ||_1) 1.898306e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.427302e-03 (SUCCESS) max(|| x_i ||_oo) 4.996108e-01 max(|| x0_i - x_i ||_oo) 1.221245e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 2.444393e-03 (SUCCESS) Test #3611: fortran_shm_flaplacian ..................................................***Timeout 167.47 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.061463e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.112075e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.528884e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops 
Prediction: Model AMD 6180 MKL Time to factorize 1.465112e-03 s Time for mapping/scheduling 7.665560e-01 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.341056e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.149096e-02 s Time to initialize coeftab 9.631204e-02 s Time to factorize 2.692375e+01 s ( 1.48 MFlop/s) Number of operations 8.15 MFlops Number of static pivots 0 Memory usage of coeftab 2.49 Mo Time to solve 1.263962e+01 s Time for refinement 1.307284e+01 s || A ||_1 5.648528e+01 max(|| b_i ||_oo) 3.432652e+01 max(|| x_i ||_oo) 7.029080e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.562080e-16 max(|| b_i - A x_i ||_1) 2.070315e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.784213e+00 (SUCCESS) max(|| x_i ||_oo) 7.029080e-01 max(|| x0_i - x_i ||_oo) 1.346360e-15 max(|| x0_i - x_i ||_oo / || x0_i ||_oo) 1.973480e-03 (SUCCESS) Test #3617: fortran_shm_fusermat_csr ................................................***Timeout 167.46 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 1 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy No compression Matrix type: Symmetric Arithmetic: Double Format: CSC N: 64 nnz: 208 WARNING: Refinement works only with 1 rhs, We will iterate on each RHS one by one +-------------------------------------------------+ Ordering subtask : Ordering 
method is: Scotch Time to compute ordering 7.337374e-01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 1198 Fill-in of L 5.759615 Time to compute symbol matrix 9.260856e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.663849e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 2396 Fill-in 11.519231 Number of operations in full-rank: LU 55.06 KFlops Prediction: Model AMD 6180 MKL Time to factorize 1.354808e-04 s Time for mapping/scheduling 5.268254e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.384968e+00 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 2.106384e-03 s Time to initialize coeftab 3.488594e-01 s Time to factorize 2.567560e+00 s (21.44 KFlop/s) Number of operations 57.79 KFlops Number of static pivots 0 Memory usage of coeftab 30.3 Ko Time to solve 4.564365e+00 s Time for refinement 2.762893e+00 s || A ||_1 1.718006e-01 max(|| b_i ||_oo) 4.964705e-01 max(|| x_i ||_oo) 6.188475e+00 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.921819e-16 max(|| b_i - A x_i ||_1) 1.415204e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 3.368906e-03 (SUCCESS) Test #3588: bcsc_mpi_rep_test_bvec_gemv_tests ....................................... Passed 154.35 sec Test #3578: bcsc_mpi_rep_test_bcsc_spmv_tests_hb .................................... Passed 154.36 sec Test #3625: python_shm_simple_obj ................................................... Passed 154.32 sec Test #3621: python_shm_simple ....................................................... 
Passed 154.34 sec Test #3450: mpi_dst_example_simple_lap_z_facto1_sched4_not_rqrrtend .................***Timeout 200.86 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.361043e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.349878e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.130340e-01 s 
+-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.066507e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.846404e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 5.466830e+00 s Time to initialize coeftab 8.197373e-01 s Test #3464: mpi_dst_example_simple_lap_z_facto2_sched4_not_pqrcpend .................***Timeout 195.44 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 
2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.824837e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.138630e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.025942e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.277196e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.305445e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.732989e+00 s Time to initialize coeftab 1.085608e+00 s Test #3470: mpi_dst_example_simple_lap_z_facto2_sched4_not_rqrcpend .................***Timeout 194.30 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: 
Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.051522e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.285875e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.196934e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 8.991298e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.042079e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 3.500038e+00 s Time to initialize coeftab 6.600638e-01 s Test #3491: mpi_dst_example_simple_lap_z_facto3_sched4_kway_svdbegin ................***Timeout 186.77 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + 
PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 4.036062e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.540747e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.531661e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.131436e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 6.169525e+01 s 
+-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 4.984241e+00 s Time to initialize coeftab 8.373617e+00 s Test #3498: mpi_dst_example_simple_lap_z_facto3_sched4_kway_pqrcpend ................***Timeout 184.58 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.283181e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.111323e-02 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.890344e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.294313e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.853443e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 4.759205e+00 s Time to initialize coeftab 1.552994e+00 s Test #3478: mpi_dst_example_simple_lap_z_facto2_sched4_kway_tqrcpend ................ Passed 168.34 sec Test #3467: mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_pqrcpbegin ... Passed 183.66 sec Test #3508: mpi_dst_example_simple_lap_z_facto3_sched4_not_tqrcpend ................. Passed 178.07 sec Test #3496: mpi_dst_example_simple_lap_z_facto3_sched4_not_pqrcpend ................. Passed 182.30 sec Test #3480: mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_tqrcpend ..... Passed 188.33 sec Test #3463: mpi_dst_example_simple_lap_z_facto2_sched4_not_pqrcpbegin ............... Passed 189.56 sec Test #3500: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_pqrcpend ..... Passed 184.99 sec Test #3452: mpi_dst_example_simple_lap_z_facto1_sched4_kway_rqrrtend ................ Passed 195.81 sec Test #3502: mpi_dst_example_simple_lap_z_facto3_sched4_not_rqrcpend ................. 
Passed 191.54 sec Test #3443: mpi_dst_example_simple_lap_z_facto1_sched4_not_tqrcpbegin ...............***Timeout 215.06 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.553014e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.986832e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.132722e-01 s 
+-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.319363e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.810502e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 5.777077e+00 s Time to initialize coeftab 1.065068e+01 s Test #3444: mpi_dst_example_simple_lap_z_facto1_sched4_not_tqrcpend .................***Timeout 215.05 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 
2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.993365e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.530644e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.042601e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.198040e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.603737e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 6.406119e+00 s Time to initialize coeftab 1.004014e+00 s Test #3445: mpi_dst_example_simple_lap_z_facto1_sched4_kway_tqrcpbegin ..............***Timeout 213.20 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 
6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.061900e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.033295e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.069832e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.192552e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.949279e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 7.116070e+00 s Time to initialize coeftab 8.559792e+00 s Test #3446: mpi_dst_example_simple_lap_z_facto1_sched4_kway_tqrcpend ................***Timeout 212.36 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.972441e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.222528e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.483078e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.506140e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 
4.183102e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 9.847846e+00 s Time to initialize coeftab 1.078677e+00 s Time to factorize 5.910308e+01 s (369.17 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 2.362428e+00 s - iteration 1 : total iteration time 4.38 s error 1.7599e-16 Time for refinement 6.732066e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 Test #3447: mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_tqrcpbegin ...***Timeout 212.34 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 
Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.798927e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.419787e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.441775e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.082320e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.530912e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 6.062103e+00 s Time to initialize coeftab 1.017274e+01 s Test #3449: mpi_dst_example_simple_lap_z_facto1_sched4_not_rqrrtbegin ...............***Timeout 212.32 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 
+-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.365495e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.870693e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.923414e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.393215e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.383793e+01 s 
+-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 5.964148e+00 s Time to initialize coeftab 9.040708e+00 s Test #3451: mpi_dst_example_simple_lap_z_facto1_sched4_kway_rqrrtbegin ..............***Timeout 212.31 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.894325e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.835641e-02 s 
+-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.849334e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.610442e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.257241e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 5.265441e+00 s Time to initialize coeftab 9.392072e+00 s Test #3453: mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_rqrrtbegin ...***Timeout 212.28 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 
3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.356457e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.613400e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.243077e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.760165e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.367658e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 5.278330e+00 s Time to initialize coeftab 8.131417e+00 s Test #3454: mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_rqrrtend .....***Timeout 212.27 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 
MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.578388e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.896043e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.883755e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.169857e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.429424e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 4.689648e+00 s Time to initialize coeftab 5.288136e-01 s Time to factorize 7.792190e+01 s (280.02 KFlop/s) Number of operations 106.83 MFlops Number of static pivots 0 Compression: ------------------------------------------------ 
Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 2.514384e+00 s Test #3455: mpi_dst_example_simple_lap_z_facto1_sched4_kway_pqrcpilu0 ...............***Timeout 212.26 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.766789e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 
Fill-in of L 17.592432 Time to compute symbol matrix 1.093235e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.011483e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.138700e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.563141e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 7.826008e+00 s Time to initialize coeftab 8.649979e-01 s Test #3456: mpi_dst_example_simple_lap_z_facto1_sched4_kway_pqrcpilu1 ...............***Timeout 212.24 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of 
projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.042292e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.726885e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.269129e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LDL^t 21.31 MFlops Prediction: Model AMD 6180 MKL Time to factorize 6.932254e-04 s Time for mapping/scheduling 1.581343e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.136860e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LDL^t Time to initialize internal csc 7.438232e+00 s Time to initialize coeftab 9.028734e-01 s Test #3457: mpi_dst_example_simple_lap_z_facto2_sched4_not_svdbegin .................***Timeout 212.23 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI 
processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.087562e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 8.135177e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.085920e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.282401e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.550188e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 5.986265e+00 s Time to initialize coeftab 1.092601e+01 s Test #3458: mpi_dst_example_simple_lap_z_facto2_sched4_not_svdend ...................***Timeout 212.21 sec ischedInit: The 
thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.308936e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 9.221833e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.544059e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 
39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.185290e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.589939e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 8.332110e+00 s Time to initialize coeftab 1.086179e+00 s Time to factorize 7.298274e+01 s (560.81 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 355 Ko / 355 Ko ------------------------------------------------ Total 451 Ko / 451 Ko Time to solve 4.438829e+00 s Test #3459: mpi_dst_example_simple_lap_z_facto2_sched4_kway_svdbegin ................***Timeout 212.20 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections 
width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.063549e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.585371e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.846992e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.008787e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.632477e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 4.918846e+00 s Time to initialize coeftab 9.037788e+00 s Test #3460: mpi_dst_example_simple_lap_z_facto2_sched4_kway_svdend ..................***Timeout 212.17 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: 
Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.845438e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.617496e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.596661e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.010215e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.979162e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.071005e+00 s Time to initialize coeftab 1.534569e+00 s Time to factorize 6.552279e+01 s (624.66 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 
0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 355 Ko / 355 Ko ------------------------------------------------ Total 451 Ko / 451 Ko Time to solve 3.077702e+00 s Test #3461: mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_svdbegin .....***Timeout 212.14 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal ischedInit: The thread number has been automatically set to 256 Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.875766e+01 s +-------------------------------------------------+ Symbolic factorization subtask: 
Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.970218e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.849122e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.800276e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.336063e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 7.771323e+00 s Time to initialize coeftab 1.106023e+01 s Test #3462: mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_svdend .......***Timeout 212.12 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and 
projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.904188e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.711694e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 7.648553e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 9.697745e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.324863e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 7.195860e+00 s Time to initialize coeftab 6.906590e-01 s Time to factorize 7.049577e+01 s (580.59 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 355 Ko / 355 Ko ------------------------------------------------ Total 451 Ko / 451 Ko Test #3465: mpi_dst_example_simple_lap_z_facto2_sched4_kway_pqrcpbegin ..............***Timeout 212.10 sec ischedInit: The thread number has been automatically set to 
256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.767209e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.757237e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.114812e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time 
to factorize 1.548061e-03 s Time for mapping/scheduling 1.104401e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.756572e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.859738e+00 s Time to initialize coeftab 4.036026e+00 s Time to factorize 8.668762e+01 s (472.15 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 355 Ko / 355 Ko ------------------------------------------------ Total 451 Ko / 451 Ko Time to solve 2.177771e+00 s Test #3466: mpi_dst_example_simple_lap_z_facto2_sched4_kway_pqrcpend ................***Timeout 212.08 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been 
automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.422573e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.354172e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.604893e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.010699e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 5.289015e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 5.994100e+00 s Time to initialize coeftab 1.041627e+00 s Time to factorize 5.414759e+01 s (755.88 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 355 Ko / 355 Ko ------------------------------------------------ Total 451 Ko / 451 Ko Time to solve 4.070872e+00 s - iteration 1 : total iteration time 6.07 s error 2.4404e-15 Time for refinement 1.186776e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 
max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.451782e-15 max(|| b_i - A x_i ||_1) 2.495713e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.297531e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.451782e-15 max(|| b_i - A x_i ||_1) 2.495713e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.297531e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.451782e-15 max(|| b_i - A x_i ||_1) 2.495713e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.297531e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.451782e-15 max(|| b_i - A x_i ||_1) 2.495713e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 6.297531e-03 (SUCCESS) Test #3468: mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_pqrcpend .....***Timeout 212.02 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and 
projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.280489e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.550336e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.177510e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 8.024062e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.854454e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 8.126114e+00 s Time to initialize coeftab 1.052074e+00 s Time to factorize 3.598979e+01 s ( 1.11 MFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 355 Ko / 355 Ko ------------------------------------------------ Total 451 Ko / 451 Ko Time to solve 3.729030e+00 s - iteration 1 : total iteration time 4.86 s error 4.4395e-16 Time for refinement 1.089977e+01 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 
2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.600611e-16 max(|| b_i - A x_i ||_1) 9.483034e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.392891e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.600611e-16 max(|| b_i - A x_i ||_1) 9.483034e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.392891e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.600611e-16 max(|| b_i - A x_i ||_1) 9.483034e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.392891e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 4.600611e-16 max(|| b_i - A x_i ||_1) 9.483034e-17 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 2.392891e-03 (SUCCESS) Test #3469: mpi_dst_example_simple_lap_z_facto2_sched4_not_rqrcpbegin ...............***Timeout 211.98 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels 
of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.237481e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.290745e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.611491e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.180283e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.843788e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 4.146957e-01 s Time to initialize coeftab 1.122960e+01 s Test #3471: mpi_dst_example_simple_lap_z_facto2_sched4_kway_rqrcpbegin ..............***Timeout 211.91 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + 
+-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.963703e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.405461e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.115593e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.371588e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.802728e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize 
internal csc 5.727970e+00 s Time to initialize coeftab 9.862373e+00 s Test #3472: mpi_dst_example_simple_lap_z_facto2_sched4_kway_rqrcpend ................***Timeout 211.84 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.604078e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.630009e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 
8.548593e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.242897e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.605927e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 4.488870e+00 s Time to initialize coeftab 1.301638e+00 s Time to factorize 7.421603e+01 s (551.49 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 355 Ko / 355 Ko ------------------------------------------------ Total 451 Ko / 451 Ko Time to solve 1.133407e+01 s - iteration 1 : total iteration time 3.77 s error 2.5207e-15 Time for refinement 8.026623e+00 s Test #3473: mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_rqrcpbegin ...***Timeout 211.77 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD 
Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.981717e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.410364e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.395255e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.046659e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.567605e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 7.423842e-01 s Time to initialize coeftab 1.025807e+01 s Test #3474: mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_rqrcpend .....***Timeout 211.69 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread 
number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.363465e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.339245e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.484525e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.133130e+01 s +-------------------------------------------------+ Analyze task: 
Total time for analyze 3.886446e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 5.214461e+00 s Time to initialize coeftab 1.415998e+00 s Time to factorize 7.321706e+01 s (559.01 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 355 Ko / 355 Ko ------------------------------------------------ Total 451 Ko / 451 Ko Time to solve 3.292687e+00 s - iteration 1 : total iteration time 9.6 s error 3.8135e-15 Time for refinement 1.369769e+01 s Test #3475: mpi_dst_example_simple_lap_z_facto2_sched4_not_tqrcpbegin ...............***Timeout 211.61 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections 
distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.124790e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.405086e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.863277e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.495547e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.836262e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.578077e+00 s Time to initialize coeftab 1.116883e+01 s Test #3476: mpi_dst_example_simple_lap_z_facto2_sched4_not_tqrcpend .................***Timeout 211.51 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of 
GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.059751e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.003031e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.397955e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 5.387075e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 2.674196e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 4.846611e+00 s Time to initialize coeftab 1.249879e+00 s Time to factorize 9.644608e+01 s (424.38 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank 
supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 355 Ko / 355 Ko ------------------------------------------------ Total 451 Ko / 451 Ko Time to solve 2.428507e+00 s Test #3477: mpi_dst_example_simple_lap_z_facto2_sched4_kway_tqrcpbegin ..............***Timeout 211.40 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.134130e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 
17.592432 Time to compute symbol matrix 8.509925e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.010896e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.588171e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.905186e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 4.820279e+00 s Time to initialize coeftab 1.361737e+01 s Test #3479: mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_tqrcpbegin ...***Timeout 211.24 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 
0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.612961e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.907318e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.588543e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.283664e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.462231e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 5.270134e+00 s Time to initialize coeftab 1.229746e+01 s Test #3481: mpi_dst_example_simple_lap_z_facto2_sched4_not_rqrrtbegin ...............***Timeout 210.60 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of 
threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch 1: 300 1140 2: 200 760 3: 200 660 Time to compute ordering 2.462472e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.394291e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.117884e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.124954e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.071471e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 5.137357e+00 s Time to initialize coeftab 9.109984e+00 s Time to factorize 7.690263e+01 s (532.22 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Test #3482: 
mpi_dst_example_simple_lap_z_facto2_sched4_not_rqrrtend .................***Timeout 208.71 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.397154e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.332066e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.247331e+00 s +-------------------------------------------------+ 
Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.351630e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.640153e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 4.475960e+00 s Time to initialize coeftab 1.771776e+00 s Time to factorize 4.671092e+01 s (876.23 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 355 Ko / 355 Ko ------------------------------------------------ Total 451 Ko / 451 Ko Time to solve 8.599818e+00 s - iteration 1 : total iteration time 3.41 s error 5.5348e-14 Time for refinement 7.844079e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.534595e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.534595e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.534595e-14 max(|| b_i - A x_i ||_2 / || b_i ||_2) 5.534595e-14 max(|| b_i - A x_i ||_1) 4.209670e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.062242e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 4.209670e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.062242e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 4.209670e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.062242e-01 (SUCCESS) max(|| b_i - A x_i ||_1) 
4.209670e-15 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.062242e-01 (SUCCESS) Test #3483: mpi_dst_example_simple_lap_z_facto2_sched4_kway_rqrrtbegin ..............***Timeout 208.69 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.840319e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.342572e-01 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 
Time for reordering 4.114212e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.215767e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.261812e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 4.993090e+00 s Time to initialize coeftab 1.258203e+01 s Test #3484: mpi_dst_example_simple_lap_z_facto2_sched4_kway_rqrrtend ................***Timeout 208.67 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N 
nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.552888e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.253790e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.306634e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.751907e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.869159e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 4.703189e+00 s Time to initialize coeftab 6.808770e-01 s Time to factorize 7.320943e+01 s (559.07 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 355 Ko / 355 Ko ------------------------------------------------ Total 451 Ko / 451 Ko Time to solve 7.836189e+00 s - iteration 1 : total iteration time 3.67 s error 2.0997e-15 Time for refinement 8.842361e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i 
||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.101247e-15 max(|| b_i - A x_i ||_1) 2.026808e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.114323e-03 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.101247e-15 max(|| b_i - A x_i ||_2 / || b_i ||_2) 2.101247e-15 max(|| b_i - A x_i ||_1) 2.026808e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 5.114323e-03 (SUCCESS) Test #3485: mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_rqrrtbegin ...***Timeout 208.66 sec ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 3.086804e+01 s 
+-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.077189e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.346498e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.165391e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.931466e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 4.680440e+00 s Time to initialize coeftab 1.478976e+01 s Time to factorize 8.823946e+01 s (463.84 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 355 Ko / 355 Ko ------------------------------------------------ Total 451 Ko / 451 Ko Test #3486: mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_rqrrtend .....***Timeout 208.63 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: 
Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRRT Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.393719e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.342055e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.489921e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.368845e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.901688e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.239955e+00 s Time to initialize coeftab 1.316323e+00 s Time to factorize 5.195603e+01 s (787.77 KFlop/s) Number of operations 32.61 MFlops 
Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 355 Ko / 355 Ko ------------------------------------------------ Total 451 Ko / 451 Ko Time to solve 8.487441e+00 s - iteration 1 : total iteration time 4.71 s error 3.6644e-15 Time for refinement 7.764543e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.670737e-15 max(|| b_i - A x_i ||_1) 4.476948e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.129686e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.670737e-15 max(|| b_i - A x_i ||_1) 4.476948e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.129686e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.670737e-15 max(|| b_i - A x_i ||_1) 4.476948e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.129686e-02 (SUCCESS) max(|| b_i - A x_i ||_2 / || b_i ||_2) 3.670737e-15 max(|| b_i - A x_i ||_1) 4.476948e-16 max(|| b_i - A x_i ||_1 / (||A||_1 * ||x_i||_oo * eps)) 1.129686e-02 (SUCCESS) Test #3487: mpi_dst_example_simple_lap_z_facto2_sched4_kway_pqrcpilu0 ...............***Timeout 208.60 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 
Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.052934e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.393000e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.284564e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.502094e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.991725e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 5.207500e+00 s Time to initialize coeftab 1.372214e+00 s 
Time to factorize 6.441843e+01 s (635.37 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 355 Ko / 355 Ko ------------------------------------------------ Total 451 Ko / 451 Ko Time to solve 2.380532e+00 s Test #3488: mpi_dst_example_simple_lap_z_facto2_sched4_kway_pqrcpilu1 ...............***Timeout 208.56 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 
1.862596e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 6.671134e-03 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.245596e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 130184 Fill-in 35.184865 Number of operations in full-rank: LU 39.97 MFlops Prediction: Model AMD 6180 MKL Time to factorize 1.548061e-03 s Time for mapping/scheduling 1.348248e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.525001e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LU Time to initialize internal csc 6.512999e+00 s Time to initialize coeftab 1.601786e+00 s Time to factorize 6.023642e+01 s (679.48 KFlop/s) Number of operations 32.61 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 355 Ko / 355 Ko ------------------------------------------------ Total 451 Ko / 451 Ko Time to solve 4.039043e+00 s - iteration 1 : total iteration time 4.97 s error 8.1413e-15 Time for refinement 8.810951e+00 s Test #3489: mpi_dst_example_simple_lap_z_facto3_sched4_not_svdbegin .................***Timeout 208.52 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: 
Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal ischedInit: The thread number has been automatically set to 256 Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 1: 300 1140 2: 200 760 3: 200 660 0: 300 1140 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.301741e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.155065e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.197141e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.469231e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.462306e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 
5.631612e+00 s Time to initialize coeftab 7.635738e+00 s Test #3490: mpi_dst_example_simple_lap_z_facto3_sched4_not_svdend ...................***Timeout 208.48 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.818292e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 3.731713e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.200309e-01 s 
+-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.239155e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.169188e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 4.687561e+00 s Time to initialize coeftab 8.861690e-01 s Test #3492: mpi_dst_example_simple_lap_z_facto3_sched4_kway_svdend ..................***Timeout 208.46 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 
760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.075909e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 4.621408e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.065540e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.276434e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.443814e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 7.379097e+00 s Time to initialize coeftab 9.840706e-01 s Test #3493: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_svdbegin .....***Timeout 208.45 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel 
MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.522161e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.043628e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.149262e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.177072e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.270521e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 5.010606e+00 s Time to initialize coeftab 6.469143e+00 s Test #3494: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_svdend .......***Timeout 208.43 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been 
automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method SVD Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 3: 200 660 2: 200 760 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.024130e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.831648e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.107609e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.352430e+01 s +-------------------------------------------------+ Analyze task: Total time for 
analyze 4.016620e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 7.760278e+00 s Time to initialize coeftab 1.237456e+00 s Time to factorize 6.269557e+01 s (331.25 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 5.119178e+00 s Test #3495: mpi_dst_example_simple_lap_z_facto3_sched4_not_pqrcpbegin ...............***Timeout 208.41 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: 
CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.788112e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.426716e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.212614e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 8.925829e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.354677e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 4.936308e+00 s Time to initialize coeftab 3.707700e+00 s Test #3497: mpi_dst_example_simple_lap_z_facto3_sched4_kway_pqrcpbegin ..............***Timeout 207.36 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size 
(min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.078911e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.963759e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.788596e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.213639e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.381323e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 7.113923e+00 s Time to initialize coeftab 3.910248e+00 s Test #3499: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_pqrcpbegin ...***Timeout 207.34 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been 
automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method PQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.381797e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.576028e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 8.211306e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.253473e+01 s 
+-------------------------------------------------+ Analyze task: Total time for analyze 3.720621e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 6.326237e+00 s Time to initialize coeftab 3.278084e+00 s Time to factorize 5.574028e+01 s (372.58 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 4.549195e+00 s Test #3501: mpi_dst_example_simple_lap_z_facto3_sched4_not_rqrcpbegin ...............***Timeout 207.32 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been 
automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.623398e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.335976e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.269680e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.468023e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.821587e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 5.522550e+00 s Time to initialize coeftab 7.721014e+00 s Test #3503: mpi_dst_example_simple_lap_z_facto3_sched4_kway_rqrcpbegin ..............***Timeout 207.13 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication 
support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.881711e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.292357e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 2.385537e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.083079e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.394190e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 4.849876e+00 s Time to initialize coeftab 7.753908e+00 s Test #3504: mpi_dst_example_simple_lap_z_facto3_sched4_kway_rqrcpend ................***Timeout 206.15 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has 
been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 3: 200 660 2: 200 760 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.933639e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.502600e-02 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 3.535886e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for 
mapping/scheduling 1.600115e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.583447e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 5.041107e+00 s Time to initialize coeftab 9.419474e-01 s Time to factorize 5.671425e+01 s (366.18 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 2.904681e+00 s - iteration 1 : total iteration time 6.89 s error 7.5931e-16 Time for refinement 1.232388e+01 s Test #3505: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_rqrcpbegin ...***Timeout 205.55 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute 
Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.176859e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.471619e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 9.584827e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 9.661913e+00 s +-------------------------------------------------+ Analyze task: Total time for analyze 3.578988e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 6.793235e+00 s Time to initialize coeftab 7.271074e+00 s Test #3506: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_rqrcpend .....***Timeout 204.71 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number 
of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Just-In-Time Tolerance 1e-08 Compress method RQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY and projections Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.519590e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.202153e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.168274e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.124166e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.603981e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 5.025145e+00 s Time to initialize coeftab 7.742727e-01 s Time to factorize 6.141367e+01 s 
(338.16 KFlop/s) Number of operations 102.73 MFlops Number of static pivots 0 Compression: ------------------------------------------------ Full-rank supernodes Inside 0 o Outside 0 o Low-rank supernodes Diag in diag 96.7 Ko Inside not selected 0 o / 0 o Inside selected 0 o / 0 o Outside 177 Ko / 177 Ko ------------------------------------------------ Total 274 Ko / 274 Ko Time to solve 3.617339e+00 s - iteration 1 : total iteration time 4.49 s error 3.6915e-16 Time for refinement 9.812120e+00 s || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 || A ||_1 5.112481e-02 max(|| b_i ||_oo) 2.468962e-02 max(|| x_i ||_oo) 6.822263e-01 Test #3507: mpi_dst_example_simple_lap_z_facto3_sched4_not_tqrcpbegin ...............***Timeout 204.71 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS 
Splitting Strategy Not used Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 2.343311e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 2.999160e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 4.584782e-01 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.262317e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.742098e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 6.882675e+00 s Time to initialize coeftab 1.064179e+01 s Test #3509: mpi_dst_example_simple_lap_z_facto3_sched4_kway_tqrcpbegin ..............***Timeout 204.66 sec ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 ischedInit: The thread number has been automatically set to 256 +-------------------------------------------------+ + PaStiX : Parallel Sparse matriX package + +-------------------------------------------------+ Version: 6.4.0 Schedulers: sequential: Enabled thread static: Started thread dynamic: Started PaRSEC: Disabled 
StarPU: Disabled Number of MPI processes: 4 Number of threads per process: 256 Number of GPUs: 0 MPI communication support: PastixMpiThreadMultiple Distribution level: 2D( 160) Blocking size (min/max): 160 / 320 Computational models CPU: AMD Opteron 6180 - Intel MKL GPU: Nvidia K40 GK1108L - CUDA 8.0 Low rank parameters: Strategy Memory Optimal Tolerance 1e-08 Compress method TQRCP Compress minimal width 16 Compress minimal height 16 Compress min ratio 1.000000 Tolerance criterion per block Absolute Orthogonalization method CGS Splitting Strategy KWAY Levels of projections 0 Levels of kway 2147483647 Projections distance 3 Projections depth 3 Projections width 1 Matrix type: Hermitian Arithmetic: Complex64 Format: CSC N: 1000 nnz: 3700 Details: N nnz 0: 300 1140 1: 300 1140 2: 200 760 3: 200 660 +-------------------------------------------------+ Ordering subtask : Ordering method is: Scotch Time to compute ordering 1.972431e+01 s +-------------------------------------------------+ Symbolic factorization subtask: Symbol factorization using: Fax Direct Number of nonzeroes in L structure 65092 Fill-in of L 17.592432 Time to compute symbol matrix 1.421319e+00 s +-------------------------------------------------+ Reordering subtask: Split level 0 Stoping criterion -1 Time for reordering 1.174492e+00 s +-------------------------------------------------+ Mapping/Scheduling subtask: Number of non-zeroes in blocked L 65092 Fill-in 17.592432 Number of operations in full-rank: LL^t 20.28 MFlops Prediction: Model AMD 6180 MKL Time to factorize 5.477282e-04 s Time for mapping/scheduling 1.786934e+01 s +-------------------------------------------------+ Analyze task: Total time for analyze 4.409802e+01 s +-------------------------------------------------+ Factorization task: Factorization used: LL^t Time to initialize internal csc 7.306599e+00 s Time to initialize coeftab 8.412805e+00 s Test #3574: bcsc_mpi_rep_test_bcsc_spmv_tests_lap_c ................................. 
Passed 55.09 sec Test #3579: bcsc_mpi_rep_test_bcsc_spmv_tests_mm2 ................................... Passed 55.31 sec Test #3573: bcsc_mpi_rep_test_bcsc_spmv_tests_lap_d ................................. Passed 59.07 sec Test #3586: bcsc_mpi_rep_test_bcsc_spmv_time_hb ..................................... Passed 59.35 sec Test #3596: bcsc_mpi_dst_test_bcsc_spmv_tests_mm .................................... Passed 59.77 sec Test #3592: bcsc_mpi_dst_test_bcsc_spmv_tests_lap_d ................................. Passed 65.95 sec Test #3595: bcsc_mpi_dst_test_bcsc_spmv_tests_rsa ................................... Passed 67.01 sec Test #3626: python_mpi_simple_obj ................................................... Passed 65.26 sec Test #3594: bcsc_mpi_dst_test_bcsc_spmv_tests_lap_z ................................. Passed 68.84 sec Test #3601: bcsc_mpi_dst_test_bcsc_spmv_time_lap_c .................................. Passed 69.45 sec Test #3587: bcsc_mpi_rep_test_bcsc_spmv_time_mm2 .................................... Passed 70.89 sec Test #3608: bcsc_mpi_dst_test_bvec_applyorder_tests ................................. Passed 69.10 sec Test #3618: fortran_mpi_fusermat_csr ................................................ Passed 69.85 sec Test #3605: bcsc_mpi_dst_test_bcsc_spmv_time_hb ..................................... Passed 72.14 sec Test #3540: mpi_dst_example_simple_lap_z_facto4_sched4_not_tqrcpend ................. Passed 80.29 sec Test #3589: bcsc_mpi_rep_test_bvec_tests ............................................ Passed 82.87 sec Test #3576: bcsc_mpi_rep_test_bcsc_spmv_tests_rsa ................................... Passed 86.16 sec Test #3516: mpi_dst_example_simple_lap_z_facto3_sched4_kway_rqrrtend ................ Passed 89.42 sec Test #3591: bcsc_mpi_dst_test_bcsc_spmv_tests_lap_s ................................. Passed 88.32 sec Test #3518: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_rqrrtend ..... 
Passed 91.44 sec Test #3577: bcsc_mpi_rep_test_bcsc_spmv_tests_mm .................................... Passed 92.50 sec Test #3603: bcsc_mpi_dst_test_bcsc_spmv_time_rsa .................................... Passed 91.58 sec Test #3599: bcsc_mpi_dst_test_bcsc_spmv_time_lap_s .................................. Passed 93.43 sec Test #3593: bcsc_mpi_dst_test_bcsc_spmv_tests_lap_c ................................. Passed 95.60 sec Test #3604: bcsc_mpi_dst_test_bcsc_spmv_time_mm ..................................... Passed 94.48 sec Test #3572: bcsc_mpi_rep_test_bcsc_spmv_tests_lap_s ................................. Passed 96.50 sec Test #3606: bcsc_mpi_dst_test_bcsc_spmv_time_mm2 .................................... Passed 95.15 sec Test #3583: bcsc_mpi_rep_test_bcsc_spmv_time_lap_z .................................. Passed 97.04 sec Test #3620: fortran_shm_fmultilap_mt ................................................ Passed 94.81 sec Test #3607: bcsc_mpi_dst_test_bvec_tests ............................................ Passed 95.52 sec Test #3585: bcsc_mpi_rep_test_bcsc_spmv_time_mm ..................................... Passed 97.90 sec Test #3575: bcsc_mpi_rep_test_bcsc_spmv_tests_lap_z ................................. Passed 98.15 sec Test #3528: mpi_dst_example_simple_lap_z_facto4_sched4_not_pqrcpend ................. Passed 103.11 sec Test #3581: bcsc_mpi_rep_test_bcsc_spmv_time_lap_d .................................. Passed 102.86 sec Test #3598: bcsc_mpi_dst_test_bcsc_spmv_tests_mm2 ................................... Passed 101.80 sec Test #3584: bcsc_mpi_rep_test_bcsc_spmv_time_rsa .................................... Passed 103.88 sec Test #3510: mpi_dst_example_simple_lap_z_facto3_sched4_kway_tqrcpend ................ Passed 105.28 sec Test #3519: mpi_dst_example_simple_lap_z_facto3_sched4_kway_pqrcpilu0 ............... Passed 104.98 sec Test #3622: python_mpi_simple ....................................................... 
Passed 101.32 sec Test #3582: bcsc_mpi_rep_test_bcsc_spmv_time_lap_c .................................. Passed 104.98 sec Test #3600: bcsc_mpi_dst_test_bcsc_spmv_time_lap_d .................................. Passed 104.21 sec Test #3610: fortran_mpi_fsimple ..................................................... Passed 104.59 sec Test #3602: bcsc_mpi_dst_test_bcsc_spmv_time_lap_z .................................. Passed 105.59 sec Test #3538: mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_rqrcpend ..... Passed 107.92 sec Test #3542: mpi_dst_example_simple_lap_z_facto4_sched4_kway_tqrcpend ................ Passed 110.47 sec Test #3546: mpi_dst_example_simple_lap_z_facto4_sched4_not_rqrrtend ................. Passed 111.77 sec Test #3623: python_shm_step-by-step ................................................. Passed 108.08 sec Test #3530: mpi_dst_example_simple_lap_z_facto4_sched4_kway_pqrcpend ................ Passed 113.43 sec Test #3550: mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_rqrrtend ..... Passed 113.16 sec Test #3548: mpi_dst_example_simple_lap_z_facto4_sched4_kway_rqrrtend ................ Passed 114.36 sec Test #3612: fortran_mpi_flaplacian .................................................. Passed 111.50 sec Test #3526: mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_svdend ....... Passed 116.55 sec Test #3512: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_tqrcpend ..... Passed 117.31 sec Test #3514: mpi_dst_example_simple_lap_z_facto3_sched4_not_rqrrtend ................. Passed 117.56 sec Test #3544: mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_tqrcpend ..... Passed 116.70 sec Test #3580: bcsc_mpi_rep_test_bcsc_spmv_time_lap_s .................................. Passed 116.91 sec Test #3517: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_rqrrtbegin ... Passed 117.90 sec Test #3532: mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_pqrcpend ..... 
Passed 118.01 sec Test #3513: mpi_dst_example_simple_lap_z_facto3_sched4_not_rqrrtbegin ............... Passed 119.64 sec Test #3547: mpi_dst_example_simple_lap_z_facto4_sched4_kway_rqrrtbegin .............. Passed 118.68 sec Test #3551: mpi_dst_example_simple_lap_z_facto4_sched4_kway_pqrcpilu0 ............... Passed 118.77 sec Test #3613: fortran_shm_fstep-by-step ............................................... Passed 115.35 sec Test #3511: mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_tqrcpbegin ... Passed 120.44 sec Test #3534: mpi_dst_example_simple_lap_z_facto4_sched4_not_rqrcpend ................. Passed 119.77 sec Test #3535: mpi_dst_example_simple_lap_z_facto4_sched4_kway_rqrcpbegin .............. Passed 120.50 sec Test #3529: mpi_dst_example_simple_lap_z_facto4_sched4_kway_pqrcpbegin .............. Passed 120.79 sec Test #3533: mpi_dst_example_simple_lap_z_facto4_sched4_not_rqrcpbegin ............... Passed 120.77 sec Test #3536: mpi_dst_example_simple_lap_z_facto4_sched4_kway_rqrcpend ................ Passed 121.13 sec Test #3531: mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_pqrcpbegin ... Passed 121.17 sec Test #3524: mpi_dst_example_simple_lap_z_facto4_sched4_kway_svdend .................. Passed 121.31 sec Test #3545: mpi_dst_example_simple_lap_z_facto4_sched4_not_rqrrtbegin ............... Passed 121.98 sec Test #3543: mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_tqrcpbegin ... Passed 122.13 sec Test #3619: fortran_shm_fmultilap_seq ............................................... Passed 119.85 sec Test #3520: mpi_dst_example_simple_lap_z_facto3_sched4_kway_pqrcpilu1 ............... Passed 123.56 sec Test #3522: mpi_dst_example_simple_lap_z_facto4_sched4_not_svdend ................... Passed 123.72 sec Test #3527: mpi_dst_example_simple_lap_z_facto4_sched4_not_pqrcpbegin ............... Passed 126.53 sec Test #3624: python_mpi_step-by-step ................................................. 
Passed 125.79 sec Test #3525: mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_svdbegin ..... Passed 130.48 sec Test #3537: mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_rqrcpbegin ... Passed 131.10 sec Test #3552: mpi_dst_example_simple_lap_z_facto4_sched4_kway_pqrcpilu1 ............... Passed 130.91 sec Test #3539: mpi_dst_example_simple_lap_z_facto4_sched4_not_tqrcpbegin ............... Passed 132.29 sec Test #3515: mpi_dst_example_simple_lap_z_facto3_sched4_kway_rqrrtbegin .............. Passed 133.35 sec Test #3549: mpi_dst_example_simple_lap_z_facto4_sched4_kwayprojections_rqrrtbegin ... Passed 133.26 sec Test #3541: mpi_dst_example_simple_lap_z_facto4_sched4_kway_tqrcpbegin .............. Passed 133.83 sec Test #3616: fortran_mpi_fmultidof ................................................... Passed 130.73 sec Test #3523: mpi_dst_example_simple_lap_z_facto4_sched4_kway_svdbegin ................ Passed 135.08 sec Test #3521: mpi_dst_example_simple_lap_z_facto4_sched4_not_svdbegin ................. Passed 135.12 sec Test #3615: fortran_shm_fmultidof ................................................... Passed 130.87 sec Test #3614: fortran_mpi_fstep-by-step ............................................... 
Passed 132.05 sec

61% tests passed, 1409 tests failed out of 3626

Total Test time (real) = 5152.52 sec

The following tests FAILED:
47 - c_shm_example_simple_solve_and_refine_lap_z_facto3 (Timeout) 49 - c_shm_example_simple_trans_lap_s_facto0 (Timeout) 62 - c_shm_example_simple_trans_lap_z_facto2 (Timeout) 68 - c_shm_example_step-by-step_lap_d_facto0 (Timeout) 69 - c_shm_example_step-by-step_lap_d_facto1 (Timeout) 70 - c_shm_example_step-by-step_lap_d_facto2 (Timeout) 71 - c_shm_example_step-by-step_lap_c_facto0 (Timeout) 72 - c_shm_example_step-by-step_lap_c_facto1 (Timeout) 73 - c_shm_example_step-by-step_lap_c_facto2 (Timeout) 74 - c_shm_example_step-by-step_lap_c_facto3 (Timeout) 76 - c_shm_example_step-by-step_lap_z_facto0 (Timeout) 77 - c_shm_example_step-by-step_lap_z_facto1 (Timeout) 78 - c_shm_example_step-by-step_lap_z_facto2 (Timeout) 79 - c_shm_example_step-by-step_lap_z_facto3 (Timeout) 80 - c_shm_example_step-by-step_lap_z_facto4 (Timeout) 89 - c_shm_example_personal_lap_s_facto0 (Timeout) 90 - c_shm_example_personal_lap_s_facto1 (Timeout) 91 - c_shm_example_personal_lap_s_facto2 (Timeout) 92 - c_shm_example_personal_lap_d_facto0 (Timeout) 93 - c_shm_example_personal_lap_d_facto1 (Timeout) 95 - c_shm_example_personal_lap_c_facto0 (Timeout) 96 - c_shm_example_personal_lap_c_facto1 (Timeout) 97 - c_shm_example_personal_lap_c_facto2 (Timeout) 98 - c_shm_example_personal_lap_c_facto3 (Timeout) 99 - c_shm_example_personal_lap_c_facto4 (Timeout) 100 - c_shm_example_personal_lap_z_facto0 (Timeout) 101 - c_shm_example_personal_lap_z_facto1 (Timeout) 102 - c_shm_example_personal_lap_z_facto2 (Timeout) 103 - c_shm_example_personal_lap_z_facto3 (Timeout) 104 - c_shm_example_personal_lap_z_facto4 (Timeout) 121 - c_shm_example_simple_scotch_rsa (Timeout) 125 - c_shm_example_simple_single_rsa (Timeout) 129 - c_shm_example_step-by-step_single_rsa (Timeout) 133 - c_shm_example_simple_refine_cg (Timeout) 140 - c_shm_example_refinement_lap_d_refine_gmres_sym (Timeout) 
149 - c_shm_example_refinement_lap_z_refine_gmres_her (Timeout) 152 - c_shm_example_refinement_lap_z_refine_gmres_sym (Timeout) 154 - c_shm_example_simple_mixed_refine_cg (Timeout) 155 - c_shm_example_simple_mixed_refine_gmres (Timeout) 160 - c_shm_example_simple_mixed_lap_d_refine_cg_sym (Timeout) 165 - c_shm_example_simple_mixed_lap_z_facto2 (Timeout) 169 - c_shm_example_simple_mixed_lap_z_refine_gmres_her (Timeout) 201 - shm_example_simple_lap_z_facto0_sched1_1d (Timeout) 216 - shm_example_simple_lap_c_facto4_sched4_1d (Timeout) 218 - shm_example_simple_lap_z_facto1_sched4_1d (Timeout) 219 - shm_example_simple_lap_z_facto2_sched4_1d (Timeout) 252 - shm_example_simple_lap_s_facto0_sched0_not_pqrcpbegin (Timeout) 258 - shm_example_simple_lap_s_facto0_sched0_not_rqrcpbegin (Timeout) 282 - shm_example_simple_lap_s_facto1_sched0_kwayprojections_svdbegin (Timeout) 286 - shm_example_simple_lap_s_facto1_sched0_kway_pqrcpbegin (Timeout) 332 - shm_example_simple_lap_s_facto2_sched0_kwayprojections_tqrcpbegin (Timeout) 339 - shm_example_simple_lap_s_facto2_sched0_kwayprojections_rqrrtend (Timeout) 362 - shm_example_simple_lap_d_facto0_sched0_kway_tqrcpbegin (Timeout) 363 - shm_example_simple_lap_d_facto0_sched0_kway_tqrcpend (Timeout) 364 - shm_example_simple_lap_d_facto0_sched0_kwayprojections_tqrcpbegin (Timeout) 375 - shm_example_simple_lap_d_facto1_sched0_not_svdend (Timeout) 388 - shm_example_simple_lap_d_facto1_sched0_kway_rqrcpbegin (Timeout) 419 - shm_example_simple_lap_d_facto2_sched0_not_rqrcpend (Timeout) 420 - shm_example_simple_lap_d_facto2_sched0_kway_rqrcpbegin (Timeout) 426 - shm_example_simple_lap_d_facto2_sched0_kway_tqrcpbegin (Timeout) 434 - shm_example_simple_lap_d_facto2_sched0_kwayprojections_rqrrtbegin (Timeout) 436 - shm_example_simple_lap_d_facto2_sched0_kway_pqrcpilu0 (Timeout) 450 - shm_example_simple_lap_c_facto0_sched0_not_rqrcpbegin (Timeout) 452 - shm_example_simple_lap_c_facto0_sched0_kway_rqrcpbegin (Timeout) 453 - 
shm_example_simple_lap_c_facto0_sched0_kway_rqrcpend (Timeout) 463 - shm_example_simple_lap_c_facto0_sched0_not_rqrrtend (Timeout) 476 - shm_example_simple_lap_c_facto1_sched0_not_pqrcpbegin (Timeout) 484 - shm_example_simple_lap_c_facto1_sched0_kway_rqrcpbegin (Timeout) 490 - shm_example_simple_lap_c_facto1_sched0_kway_tqrcpbegin (Timeout) 492 - shm_example_simple_lap_c_facto1_sched0_kwayprojections_tqrcpbegin (Timeout) 493 - shm_example_simple_lap_c_facto1_sched0_kwayprojections_tqrcpend (Timeout) 494 - shm_example_simple_lap_c_facto1_sched0_not_rqrrtbegin (Timeout) 502 - shm_example_simple_lap_c_facto2_sched0_not_svdbegin (Timeout) 504 - shm_example_simple_lap_c_facto2_sched0_kway_svdbegin (Timeout) 506 - shm_example_simple_lap_c_facto2_sched0_kwayprojections_svdbegin (Timeout) 739 - shm_example_simple_lap_z_facto4_sched0_not_rqrcpend (Timeout) 742 - shm_example_simple_lap_z_facto4_sched0_kwayprojections_rqrcpbegin (Timeout) 766 - shm_example_simple_lap_s_facto0_sched1_kway_pqrcpbegin (Timeout) 787 - shm_example_simple_lap_s_facto0_sched1_kwayprojections_rqrrtend (Timeout) 791 - shm_example_simple_lap_s_facto1_sched1_not_svdend (Timeout) 793 - shm_example_simple_lap_s_facto1_sched1_kway_svdend (Timeout) 795 - shm_example_simple_lap_s_facto1_sched1_kwayprojections_svdend (Timeout) 796 - shm_example_simple_lap_s_facto1_sched1_not_pqrcpbegin (Timeout) 878 - shm_example_simple_lap_d_facto0_sched1_not_rqrrtbegin (Timeout) 1039 - shm_example_simple_lap_c_facto2_sched1_kway_rqrrtbegin (Timeout) 1041 - shm_example_simple_lap_c_facto2_sched1_kwayprojections_rqrrtbegin (Timeout) 1043 - shm_example_simple_lap_c_facto2_sched1_kway_pqrcpilu0 (Timeout) 1045 - shm_example_simple_lap_c_facto3_sched1_not_svdbegin (Timeout) 1047 - shm_example_simple_lap_c_facto3_sched1_kway_svdbegin (Timeout) 1049 - shm_example_simple_lap_c_facto3_sched1_kwayprojections_svdbegin (Timeout) 1079 - shm_example_simple_lap_c_facto4_sched1_kway_svdbegin (Timeout) 1106 - 
shm_example_simple_lap_c_facto4_sched1_kwayprojections_rqrrtend (Timeout) 1163 - shm_example_simple_lap_z_facto1_sched1_kwayprojections_tqrcpbegin (Timeout) 1165 - shm_example_simple_lap_z_facto1_sched1_not_rqrrtbegin (Timeout) 1178 - shm_example_simple_lap_z_facto2_sched1_kwayprojections_svdend (Timeout) 1226 - shm_example_simple_lap_z_facto3_sched1_kway_tqrcpend (Timeout) 1240 - shm_example_simple_lap_z_facto4_sched1_kway_svdend (Timeout) 1243 - shm_example_simple_lap_z_facto4_sched1_not_pqrcpbegin (Timeout) 1313 - shm_example_simple_lap_s_facto1_sched4_not_rqrcpbegin (Timeout) 1360 - shm_example_simple_lap_s_facto2_sched4_kway_rqrrtend (Timeout) 1361 - shm_example_simple_lap_s_facto2_sched4_kwayprojections_rqrrtbegin (Timeout) 1372 - shm_example_simple_lap_d_facto0_sched4_not_pqrcpend (Timeout) 1374 - shm_example_simple_lap_d_facto0_sched4_kway_pqrcpend (Timeout) 1378 - shm_example_simple_lap_d_facto0_sched4_not_rqrcpend (Timeout) 1381 - shm_example_simple_lap_d_facto0_sched4_kwayprojections_rqrcpbegin (Timeout) 1397 - shm_example_simple_lap_d_facto1_sched4_not_svdbegin (Timeout) 1427 - shm_example_simple_lap_d_facto1_sched4_kway_pqrcpilu0 (Timeout) 1437 - shm_example_simple_lap_d_facto2_sched4_kway_pqrcpbegin (Timeout) 1444 - shm_example_simple_lap_d_facto2_sched4_kway_rqrcpend (Timeout) 1453 - shm_example_simple_lap_d_facto2_sched4_not_rqrrtbegin (Timeout) 1455 - shm_example_simple_lap_d_facto2_sched4_kway_rqrrtbegin (Timeout) 1460 - shm_example_simple_lap_d_facto2_sched4_kway_pqrcpilu1 (Timeout) 1461 - shm_example_simple_lap_c_facto0_sched4_not_svdbegin (Timeout) 1463 - shm_example_simple_lap_c_facto0_sched4_kway_svdbegin (Timeout) 1465 - shm_example_simple_lap_c_facto0_sched4_kwayprojections_svdbegin (Timeout) 1471 - shm_example_simple_lap_c_facto0_sched4_kwayprojections_pqrcpbegin (Timeout) 1475 - shm_example_simple_lap_c_facto0_sched4_kway_rqrcpbegin (Timeout) 1476 - shm_example_simple_lap_c_facto0_sched4_kway_rqrcpend (Timeout) 1477 - 
shm_example_simple_lap_c_facto0_sched4_kwayprojections_rqrcpbegin (Timeout) 1479 - shm_example_simple_lap_c_facto0_sched4_not_tqrcpbegin (Timeout) 1482 - shm_example_simple_lap_c_facto0_sched4_kway_tqrcpend (Timeout) 1490 - shm_example_simple_lap_c_facto0_sched4_kwayprojections_rqrrtend (Timeout) 1491 - shm_example_simple_lap_c_facto0_sched4_kway_pqrcpilu0 (Timeout) 1495 - shm_example_simple_lap_c_facto1_sched4_kway_svdbegin (Timeout) 1497 - shm_example_simple_lap_c_facto1_sched4_kwayprojections_svdbegin (Timeout) 1500 - shm_example_simple_lap_c_facto1_sched4_not_pqrcpend (Timeout) 1511 - shm_example_simple_lap_c_facto1_sched4_not_tqrcpbegin (Timeout) 1513 - shm_example_simple_lap_c_facto1_sched4_kway_tqrcpbegin (Timeout) 1514 - shm_example_simple_lap_c_facto1_sched4_kway_tqrcpend (Timeout) 1515 - shm_example_simple_lap_c_facto1_sched4_kwayprojections_tqrcpbegin (Timeout) 1517 - shm_example_simple_lap_c_facto1_sched4_not_rqrrtbegin (Timeout) 1522 - shm_example_simple_lap_c_facto1_sched4_kwayprojections_rqrrtend (Timeout) 1523 - shm_example_simple_lap_c_facto1_sched4_kway_pqrcpilu0 (Timeout) 1525 - shm_example_simple_lap_c_facto2_sched4_not_svdbegin (Timeout) 1526 - shm_example_simple_lap_c_facto2_sched4_not_svdend (Timeout) 1528 - shm_example_simple_lap_c_facto2_sched4_kway_svdend (Timeout) 1533 - shm_example_simple_lap_c_facto2_sched4_kway_pqrcpbegin (Timeout) 1541 - shm_example_simple_lap_c_facto2_sched4_kwayprojections_rqrcpbegin (Timeout) 1543 - shm_example_simple_lap_c_facto2_sched4_not_tqrcpbegin (Timeout) 1549 - shm_example_simple_lap_c_facto2_sched4_not_rqrrtbegin (Timeout) 1553 - shm_example_simple_lap_c_facto2_sched4_kwayprojections_rqrrtbegin (Timeout) 1566 - shm_example_simple_lap_c_facto3_sched4_kway_pqrcpend (Timeout) 1571 - shm_example_simple_lap_c_facto3_sched4_kway_rqrcpbegin (Timeout) 1580 - shm_example_simple_lap_c_facto3_sched4_kwayprojections_tqrcpend (Timeout) 1582 - shm_example_simple_lap_c_facto3_sched4_not_rqrrtend (Timeout) 1585 - 
shm_example_simple_lap_c_facto3_sched4_kwayprojections_rqrrtbegin (Timeout) 1591 - shm_example_simple_lap_c_facto4_sched4_kway_svdbegin (Timeout) 1594 - shm_example_simple_lap_c_facto4_sched4_kwayprojections_svdend (Timeout) 1636 - shm_example_simple_lap_z_facto0_sched4_kway_rqrcpend (Timeout) 1646 - shm_example_simple_lap_z_facto0_sched4_not_rqrrtend (Timeout) 1649 - shm_example_simple_lap_z_facto0_sched4_kwayprojections_rqrrtbegin (Timeout) 1657 - shm_example_simple_lap_z_facto1_sched4_kwayprojections_svdbegin (Timeout) 1658 - shm_example_simple_lap_z_facto1_sched4_kwayprojections_svdend (Timeout) 1659 - shm_example_simple_lap_z_facto1_sched4_not_pqrcpbegin (Timeout) 1660 - shm_example_simple_lap_z_facto1_sched4_not_pqrcpend (Timeout) 1661 - shm_example_simple_lap_z_facto1_sched4_kway_pqrcpbegin (Timeout) 1665 - shm_example_simple_lap_z_facto1_sched4_not_rqrcpbegin (Timeout) 1668 - shm_example_simple_lap_z_facto1_sched4_kway_rqrcpend (Timeout) 1669 - shm_example_simple_lap_z_facto1_sched4_kwayprojections_rqrcpbegin (Timeout) 1670 - shm_example_simple_lap_z_facto1_sched4_kwayprojections_rqrcpend (Timeout) 1671 - shm_example_simple_lap_z_facto1_sched4_not_tqrcpbegin (Timeout) 1676 - shm_example_simple_lap_z_facto1_sched4_kwayprojections_tqrcpend (Timeout) 1681 - shm_example_simple_lap_z_facto1_sched4_kwayprojections_rqrrtbegin (Timeout) 1684 - shm_example_simple_lap_z_facto1_sched4_kway_pqrcpilu1 (Timeout) 1707 - shm_example_simple_lap_z_facto2_sched4_kwayprojections_tqrcpbegin (Timeout) 1710 - shm_example_simple_lap_z_facto2_sched4_not_rqrrtend (Timeout) 1714 - shm_example_simple_lap_z_facto2_sched4_kwayprojections_rqrrtend (Timeout) 1720 - shm_example_simple_lap_z_facto3_sched4_kway_svdend (Timeout) 1745 - shm_example_simple_lap_z_facto3_sched4_kwayprojections_rqrrtbegin (Timeout) 1747 - shm_example_simple_lap_z_facto3_sched4_kway_pqrcpilu0 (Timeout) 1752 - shm_example_simple_lap_z_facto4_sched4_kway_svdend (Timeout) 1756 - 
shm_example_simple_lap_z_facto4_sched4_not_pqrcpend (Timeout) 1823 - c_mpi_rep_example_simple_solve_and_refine_lap_c_facto4 (Timeout) 1885 - c_mpi_rep_example_step-by-step_single_rsa (Timeout) 1888 - c_mpi_rep_example_step-by-step_single_mm2 (Timeout) 1896 - c_mpi_rep_example_refinement_lap_d_refine_gmres_sym (Timeout) 1903 - c_mpi_rep_example_refinement_lap_c_refine_bicgstab_sym (Timeout) 1905 - c_mpi_rep_example_refinement_lap_z_refine_gmres_her (Timeout) 1906 - c_mpi_rep_example_refinement_lap_z_refine_bicgstab_her (Timeout) 1907 - c_mpi_rep_example_refinement_lap_z_refine_cg_sym (Timeout) 1908 - c_mpi_rep_example_refinement_lap_z_refine_gmres_sym (Timeout) 1909 - c_mpi_rep_example_refinement_lap_z_refine_bicgstab_sym (Timeout) 1912 - c_mpi_rep_example_simple_mixed_refine_bicgstab (Timeout) 1916 - c_mpi_rep_example_simple_mixed_lap_z_refine_cg_her (Timeout) 1919 - c_mpi_rep_example_simple_mixed_lap_z_refine_cg_sym (Timeout) 1926 - mpi_rep_example_simple_lap_d_facto1_sched0_1d (Timeout) 1928 - mpi_rep_example_simple_lap_c_facto0_sched0_1d (Timeout) 1935 - mpi_rep_example_simple_lap_z_facto2_sched0_1d (Timeout) 1939 - mpi_rep_example_simple_lap_s_facto1_sched1_1d (Timeout) 1940 - mpi_rep_example_simple_lap_s_facto2_sched1_1d (Timeout) 1944 - mpi_rep_example_simple_lap_c_facto0_sched1_1d (Timeout) 1952 - mpi_rep_example_simple_lap_z_facto3_sched1_1d (Timeout) 1956 - mpi_rep_example_simple_lap_s_facto2_sched4_1d (Timeout) 1957 - mpi_rep_example_simple_lap_d_facto0_sched4_1d (Timeout) 1958 - mpi_rep_example_simple_lap_d_facto1_sched4_1d (Timeout) 1961 - mpi_rep_example_simple_lap_c_facto1_sched4_1d (Timeout) 1962 - mpi_rep_example_simple_lap_c_facto2_sched4_1d (Timeout) 1963 - mpi_rep_example_simple_lap_c_facto3_sched4_1d (Timeout) 1964 - mpi_rep_example_simple_lap_c_facto4_sched4_1d (Timeout) 1966 - mpi_rep_example_simple_lap_z_facto1_sched4_1d (Timeout) 1967 - mpi_rep_example_simple_lap_z_facto2_sched4_1d (Timeout) 1968 - 
mpi_rep_example_simple_lap_z_facto3_sched4_1d (Timeout) 1969 - mpi_rep_example_simple_lap_z_facto4_sched4_1d (Timeout) 1970 - mpi_dst_example_simple_lap_s_facto0_sched0_1d (Timeout) 1971 - mpi_dst_example_simple_lap_s_facto1_sched0_1d (Timeout) 1973 - mpi_dst_example_simple_lap_d_facto0_sched0_1d (Timeout) 1976 - mpi_dst_example_simple_lap_c_facto0_sched0_1d (Timeout) 1978 - mpi_dst_example_simple_lap_c_facto2_sched0_1d (Timeout) 1979 - mpi_dst_example_simple_lap_c_facto3_sched0_1d (Timeout) 1980 - mpi_dst_example_simple_lap_c_facto4_sched0_1d (Timeout) 1981 - mpi_dst_example_simple_lap_z_facto0_sched0_1d (Timeout) 1982 - mpi_dst_example_simple_lap_z_facto1_sched0_1d (Timeout) 1983 - mpi_dst_example_simple_lap_z_facto2_sched0_1d (Timeout) 1984 - mpi_dst_example_simple_lap_z_facto3_sched0_1d (Timeout) 1985 - mpi_dst_example_simple_lap_z_facto4_sched0_1d (Timeout) 1986 - mpi_dst_example_simple_lap_s_facto0_sched1_1d (Timeout) 1987 - mpi_dst_example_simple_lap_s_facto1_sched1_1d (Timeout) 1989 - mpi_dst_example_simple_lap_d_facto0_sched1_1d (Timeout) 1990 - mpi_dst_example_simple_lap_d_facto1_sched1_1d (Timeout) 1991 - mpi_dst_example_simple_lap_d_facto2_sched1_1d (Timeout) 1992 - mpi_dst_example_simple_lap_c_facto0_sched1_1d (Timeout) 1993 - mpi_dst_example_simple_lap_c_facto1_sched1_1d (Timeout) 1994 - mpi_dst_example_simple_lap_c_facto2_sched1_1d (Timeout) 1995 - mpi_dst_example_simple_lap_c_facto3_sched1_1d (Timeout) 1996 - mpi_dst_example_simple_lap_c_facto4_sched1_1d (Timeout) 1997 - mpi_dst_example_simple_lap_z_facto0_sched1_1d (Timeout) 2001 - mpi_dst_example_simple_lap_z_facto4_sched1_1d (Timeout) 2002 - mpi_dst_example_simple_lap_s_facto0_sched4_1d (Timeout) 2003 - mpi_dst_example_simple_lap_s_facto1_sched4_1d (Timeout) 2004 - mpi_dst_example_simple_lap_s_facto2_sched4_1d (Timeout) 2005 - mpi_dst_example_simple_lap_d_facto0_sched4_1d (Timeout) 2006 - mpi_dst_example_simple_lap_d_facto1_sched4_1d (Timeout) 2007 - mpi_dst_example_simple_lap_d_facto2_sched4_1d 
(Timeout) 2008 - mpi_dst_example_simple_lap_c_facto0_sched4_1d (Timeout) 2009 - mpi_dst_example_simple_lap_c_facto1_sched4_1d (Timeout) 2010 - mpi_dst_example_simple_lap_c_facto2_sched4_1d (Timeout) 2014 - mpi_dst_example_simple_lap_z_facto1_sched4_1d (Timeout) 2019 - mpi_dst_example_simple_lap_s_facto0_sched0_not_svdend (Timeout) 2020 - mpi_dst_example_simple_lap_s_facto0_sched0_kway_svdbegin (Timeout) 2022 - mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_svdbegin (Timeout) 2025 - mpi_dst_example_simple_lap_s_facto0_sched0_not_pqrcpend (Timeout) 2027 - mpi_dst_example_simple_lap_s_facto0_sched0_kway_pqrcpend (Timeout) 2028 - mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_pqrcpbegin (Timeout) 2030 - mpi_dst_example_simple_lap_s_facto0_sched0_not_rqrcpbegin (Timeout) 2032 - mpi_dst_example_simple_lap_s_facto0_sched0_kway_rqrcpbegin (Timeout) 2033 - mpi_dst_example_simple_lap_s_facto0_sched0_kway_rqrcpend (Timeout) 2034 - mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_rqrcpbegin (Timeout) 2035 - mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_rqrcpend (Timeout) 2036 - mpi_dst_example_simple_lap_s_facto0_sched0_not_tqrcpbegin (Timeout) 2037 - mpi_dst_example_simple_lap_s_facto0_sched0_not_tqrcpend (Timeout) 2038 - mpi_dst_example_simple_lap_s_facto0_sched0_kway_tqrcpbegin (Timeout) 2039 - mpi_dst_example_simple_lap_s_facto0_sched0_kway_tqrcpend (Timeout) 2040 - mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_tqrcpbegin (Timeout) 2041 - mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_tqrcpend (Timeout) 2042 - mpi_dst_example_simple_lap_s_facto0_sched0_not_rqrrtbegin (Timeout) 2043 - mpi_dst_example_simple_lap_s_facto0_sched0_not_rqrrtend (Timeout) 2044 - mpi_dst_example_simple_lap_s_facto0_sched0_kway_rqrrtbegin (Timeout) 2045 - mpi_dst_example_simple_lap_s_facto0_sched0_kway_rqrrtend (Timeout) 2046 - mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_rqrrtbegin (Timeout) 2047 - 
mpi_dst_example_simple_lap_s_facto0_sched0_kwayprojections_rqrrtend (Timeout) 2048 - mpi_dst_example_simple_lap_s_facto0_sched0_kway_pqrcpilu0 (Timeout) 2049 - mpi_dst_example_simple_lap_s_facto0_sched0_kway_pqrcpilu1 (Timeout) 2050 - mpi_dst_example_simple_lap_s_facto1_sched0_not_svdbegin (Timeout) 2051 - mpi_dst_example_simple_lap_s_facto1_sched0_not_svdend (Timeout) 2053 - mpi_dst_example_simple_lap_s_facto1_sched0_kway_svdend (Timeout) 2054 - mpi_dst_example_simple_lap_s_facto1_sched0_kwayprojections_svdbegin (Timeout) 2055 - mpi_dst_example_simple_lap_s_facto1_sched0_kwayprojections_svdend (Timeout) 2056 - mpi_dst_example_simple_lap_s_facto1_sched0_not_pqrcpbegin (Timeout) 2057 - mpi_dst_example_simple_lap_s_facto1_sched0_not_pqrcpend (Timeout) 2058 - mpi_dst_example_simple_lap_s_facto1_sched0_kway_pqrcpbegin (Timeout) 2059 - mpi_dst_example_simple_lap_s_facto1_sched0_kway_pqrcpend (Timeout) 2060 - mpi_dst_example_simple_lap_s_facto1_sched0_kwayprojections_pqrcpbegin (Timeout) 2063 - mpi_dst_example_simple_lap_s_facto1_sched0_not_rqrcpend (Timeout) 2064 - mpi_dst_example_simple_lap_s_facto1_sched0_kway_rqrcpbegin (Timeout) 2066 - mpi_dst_example_simple_lap_s_facto1_sched0_kwayprojections_rqrcpbegin (Timeout) 2068 - mpi_dst_example_simple_lap_s_facto1_sched0_not_tqrcpbegin (Timeout) 2069 - mpi_dst_example_simple_lap_s_facto1_sched0_not_tqrcpend (Timeout) 2070 - mpi_dst_example_simple_lap_s_facto1_sched0_kway_tqrcpbegin (Timeout) 2071 - mpi_dst_example_simple_lap_s_facto1_sched0_kway_tqrcpend (Timeout) 2072 - mpi_dst_example_simple_lap_s_facto1_sched0_kwayprojections_tqrcpbegin (Timeout) 2082 - mpi_dst_example_simple_lap_s_facto2_sched0_not_svdbegin (Timeout) 2087 - mpi_dst_example_simple_lap_s_facto2_sched0_kwayprojections_svdend (Timeout) 2088 - mpi_dst_example_simple_lap_s_facto2_sched0_not_pqrcpbegin (Timeout) 2089 - mpi_dst_example_simple_lap_s_facto2_sched0_not_pqrcpend (Timeout) 2090 - mpi_dst_example_simple_lap_s_facto2_sched0_kway_pqrcpbegin (Timeout) 
2091 - mpi_dst_example_simple_lap_s_facto2_sched0_kway_pqrcpend (Timeout) 2093 - mpi_dst_example_simple_lap_s_facto2_sched0_kwayprojections_pqrcpend (Timeout) 2094 - mpi_dst_example_simple_lap_s_facto2_sched0_not_rqrcpbegin (Timeout) 2095 - mpi_dst_example_simple_lap_s_facto2_sched0_not_rqrcpend (Timeout) 2105 - mpi_dst_example_simple_lap_s_facto2_sched0_kwayprojections_tqrcpend (Timeout) 2106 - mpi_dst_example_simple_lap_s_facto2_sched0_not_rqrrtbegin (Timeout) 2107 - mpi_dst_example_simple_lap_s_facto2_sched0_not_rqrrtend (Timeout) 2108 - mpi_dst_example_simple_lap_s_facto2_sched0_kway_rqrrtbegin (Timeout) 2109 - mpi_dst_example_simple_lap_s_facto2_sched0_kway_rqrrtend (Timeout) 2110 - mpi_dst_example_simple_lap_s_facto2_sched0_kwayprojections_rqrrtbegin (Timeout) 2111 - mpi_dst_example_simple_lap_s_facto2_sched0_kwayprojections_rqrrtend (Timeout) 2112 - mpi_dst_example_simple_lap_s_facto2_sched0_kway_pqrcpilu0 (Timeout) 2113 - mpi_dst_example_simple_lap_s_facto2_sched0_kway_pqrcpilu1 (Timeout) 2114 - mpi_dst_example_simple_lap_d_facto0_sched0_not_svdbegin (Timeout) 2115 - mpi_dst_example_simple_lap_d_facto0_sched0_not_svdend (Timeout) 2116 - mpi_dst_example_simple_lap_d_facto0_sched0_kway_svdbegin (Timeout) 2117 - mpi_dst_example_simple_lap_d_facto0_sched0_kway_svdend (Timeout) 2118 - mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_svdbegin (Timeout) 2119 - mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_svdend (Timeout) 2120 - mpi_dst_example_simple_lap_d_facto0_sched0_not_pqrcpbegin (Timeout) 2121 - mpi_dst_example_simple_lap_d_facto0_sched0_not_pqrcpend (Timeout) 2122 - mpi_dst_example_simple_lap_d_facto0_sched0_kway_pqrcpbegin (Timeout) 2123 - mpi_dst_example_simple_lap_d_facto0_sched0_kway_pqrcpend (Timeout) 2124 - mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_pqrcpbegin (Timeout) 2125 - mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_pqrcpend (Timeout) 2126 - 
mpi_dst_example_simple_lap_d_facto0_sched0_not_rqrcpbegin (Timeout) 2127 - mpi_dst_example_simple_lap_d_facto0_sched0_not_rqrcpend (Timeout) 2128 - mpi_dst_example_simple_lap_d_facto0_sched0_kway_rqrcpbegin (Timeout) 2129 - mpi_dst_example_simple_lap_d_facto0_sched0_kway_rqrcpend (Timeout) 2130 - mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_rqrcpbegin (Timeout) 2131 - mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_rqrcpend (Timeout) 2132 - mpi_dst_example_simple_lap_d_facto0_sched0_not_tqrcpbegin (Timeout) 2133 - mpi_dst_example_simple_lap_d_facto0_sched0_not_tqrcpend (Timeout) 2134 - mpi_dst_example_simple_lap_d_facto0_sched0_kway_tqrcpbegin (Timeout) 2135 - mpi_dst_example_simple_lap_d_facto0_sched0_kway_tqrcpend (Timeout) 2136 - mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_tqrcpbegin (Timeout) 2137 - mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_tqrcpend (Timeout) 2138 - mpi_dst_example_simple_lap_d_facto0_sched0_not_rqrrtbegin (Timeout) 2139 - mpi_dst_example_simple_lap_d_facto0_sched0_not_rqrrtend (Timeout) 2140 - mpi_dst_example_simple_lap_d_facto0_sched0_kway_rqrrtbegin (Timeout) 2142 - mpi_dst_example_simple_lap_d_facto0_sched0_kwayprojections_rqrrtbegin (Timeout) 2145 - mpi_dst_example_simple_lap_d_facto0_sched0_kway_pqrcpilu1 (Timeout) 2146 - mpi_dst_example_simple_lap_d_facto1_sched0_not_svdbegin (Timeout) 2147 - mpi_dst_example_simple_lap_d_facto1_sched0_not_svdend (Timeout) 2148 - mpi_dst_example_simple_lap_d_facto1_sched0_kway_svdbegin (Timeout) 2150 - mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_svdbegin (Timeout) 2151 - mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_svdend (Timeout) 2152 - mpi_dst_example_simple_lap_d_facto1_sched0_not_pqrcpbegin (Timeout) 2153 - mpi_dst_example_simple_lap_d_facto1_sched0_not_pqrcpend (Timeout) 2155 - mpi_dst_example_simple_lap_d_facto1_sched0_kway_pqrcpend (Timeout) 2157 - 
mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_pqrcpend (Timeout) 2159 - mpi_dst_example_simple_lap_d_facto1_sched0_not_rqrcpend (Timeout) 2160 - mpi_dst_example_simple_lap_d_facto1_sched0_kway_rqrcpbegin (Timeout) 2161 - mpi_dst_example_simple_lap_d_facto1_sched0_kway_rqrcpend (Timeout) 2163 - mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_rqrcpend (Timeout) 2164 - mpi_dst_example_simple_lap_d_facto1_sched0_not_tqrcpbegin (Timeout) 2165 - mpi_dst_example_simple_lap_d_facto1_sched0_not_tqrcpend (Timeout) 2166 - mpi_dst_example_simple_lap_d_facto1_sched0_kway_tqrcpbegin (Timeout) 2167 - mpi_dst_example_simple_lap_d_facto1_sched0_kway_tqrcpend (Timeout) 2168 - mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_tqrcpbegin (Timeout) 2171 - mpi_dst_example_simple_lap_d_facto1_sched0_not_rqrrtend (Timeout) 2172 - mpi_dst_example_simple_lap_d_facto1_sched0_kway_rqrrtbegin (Timeout) 2174 - mpi_dst_example_simple_lap_d_facto1_sched0_kwayprojections_rqrrtbegin (Timeout) 2176 - mpi_dst_example_simple_lap_d_facto1_sched0_kway_pqrcpilu0 (Timeout) 2177 - mpi_dst_example_simple_lap_d_facto1_sched0_kway_pqrcpilu1 (Timeout) 2178 - mpi_dst_example_simple_lap_d_facto2_sched0_not_svdbegin (Timeout) 2179 - mpi_dst_example_simple_lap_d_facto2_sched0_not_svdend (Timeout) 2183 - mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_svdend (Timeout) 2184 - mpi_dst_example_simple_lap_d_facto2_sched0_not_pqrcpbegin (Timeout) 2185 - mpi_dst_example_simple_lap_d_facto2_sched0_not_pqrcpend (Timeout) 2186 - mpi_dst_example_simple_lap_d_facto2_sched0_kway_pqrcpbegin (Timeout) 2187 - mpi_dst_example_simple_lap_d_facto2_sched0_kway_pqrcpend (Timeout) 2188 - mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_pqrcpbegin (Timeout) 2189 - mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_pqrcpend (Timeout) 2190 - mpi_dst_example_simple_lap_d_facto2_sched0_not_rqrcpbegin (Timeout) 2192 - mpi_dst_example_simple_lap_d_facto2_sched0_kway_rqrcpbegin 
(Timeout) 2194 - mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_rqrcpbegin (Timeout) 2195 - mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_rqrcpend (Timeout) 2196 - mpi_dst_example_simple_lap_d_facto2_sched0_not_tqrcpbegin (Timeout) 2197 - mpi_dst_example_simple_lap_d_facto2_sched0_not_tqrcpend (Timeout) 2198 - mpi_dst_example_simple_lap_d_facto2_sched0_kway_tqrcpbegin (Timeout) 2199 - mpi_dst_example_simple_lap_d_facto2_sched0_kway_tqrcpend (Timeout) 2200 - mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_tqrcpbegin (Timeout) 2201 - mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_tqrcpend (Timeout) 2202 - mpi_dst_example_simple_lap_d_facto2_sched0_not_rqrrtbegin (Timeout) 2203 - mpi_dst_example_simple_lap_d_facto2_sched0_not_rqrrtend (Timeout) 2204 - mpi_dst_example_simple_lap_d_facto2_sched0_kway_rqrrtbegin (Timeout) 2205 - mpi_dst_example_simple_lap_d_facto2_sched0_kway_rqrrtend (Timeout) 2206 - mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_rqrrtbegin (Timeout) 2207 - mpi_dst_example_simple_lap_d_facto2_sched0_kwayprojections_rqrrtend (Timeout) 2208 - mpi_dst_example_simple_lap_d_facto2_sched0_kway_pqrcpilu0 (Timeout) 2209 - mpi_dst_example_simple_lap_d_facto2_sched0_kway_pqrcpilu1 (Timeout) 2210 - mpi_dst_example_simple_lap_c_facto0_sched0_not_svdbegin (Timeout) 2211 - mpi_dst_example_simple_lap_c_facto0_sched0_not_svdend (Timeout) 2212 - mpi_dst_example_simple_lap_c_facto0_sched0_kway_svdbegin (Timeout) 2213 - mpi_dst_example_simple_lap_c_facto0_sched0_kway_svdend (Timeout) 2214 - mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_svdbegin (Timeout) 2215 - mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_svdend (Timeout) 2216 - mpi_dst_example_simple_lap_c_facto0_sched0_not_pqrcpbegin (Timeout) 2217 - mpi_dst_example_simple_lap_c_facto0_sched0_not_pqrcpend (Timeout) 2218 - mpi_dst_example_simple_lap_c_facto0_sched0_kway_pqrcpbegin (Timeout) 2220 - 
mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_pqrcpbegin (Timeout) 2221 - mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_pqrcpend (Timeout) 2222 - mpi_dst_example_simple_lap_c_facto0_sched0_not_rqrcpbegin (Timeout) 2225 - mpi_dst_example_simple_lap_c_facto0_sched0_kway_rqrcpend (Timeout) 2226 - mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_rqrcpbegin (Timeout) 2227 - mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_rqrcpend (Timeout) 2228 - mpi_dst_example_simple_lap_c_facto0_sched0_not_tqrcpbegin (Timeout) 2229 - mpi_dst_example_simple_lap_c_facto0_sched0_not_tqrcpend (Timeout) 2230 - mpi_dst_example_simple_lap_c_facto0_sched0_kway_tqrcpbegin (Timeout) 2231 - mpi_dst_example_simple_lap_c_facto0_sched0_kway_tqrcpend (Timeout) 2232 - mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_tqrcpbegin (Timeout) 2233 - mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_tqrcpend (Timeout) 2234 - mpi_dst_example_simple_lap_c_facto0_sched0_not_rqrrtbegin (Timeout) 2235 - mpi_dst_example_simple_lap_c_facto0_sched0_not_rqrrtend (Timeout) 2236 - mpi_dst_example_simple_lap_c_facto0_sched0_kway_rqrrtbegin (Timeout) 2238 - mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_rqrrtbegin (Timeout) 2239 - mpi_dst_example_simple_lap_c_facto0_sched0_kwayprojections_rqrrtend (Timeout) 2240 - mpi_dst_example_simple_lap_c_facto0_sched0_kway_pqrcpilu0 (Timeout) 2241 - mpi_dst_example_simple_lap_c_facto0_sched0_kway_pqrcpilu1 (Timeout) 2242 - mpi_dst_example_simple_lap_c_facto1_sched0_not_svdbegin (Timeout) 2243 - mpi_dst_example_simple_lap_c_facto1_sched0_not_svdend (Timeout) 2244 - mpi_dst_example_simple_lap_c_facto1_sched0_kway_svdbegin (Timeout) 2245 - mpi_dst_example_simple_lap_c_facto1_sched0_kway_svdend (Timeout) 2246 - mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_svdbegin (Timeout) 2247 - mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_svdend (Timeout) 2248 - 
mpi_dst_example_simple_lap_c_facto1_sched0_not_pqrcpbegin (Timeout) 2249 - mpi_dst_example_simple_lap_c_facto1_sched0_not_pqrcpend (Timeout) 2250 - mpi_dst_example_simple_lap_c_facto1_sched0_kway_pqrcpbegin (Timeout) 2251 - mpi_dst_example_simple_lap_c_facto1_sched0_kway_pqrcpend (Timeout) 2252 - mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_pqrcpbegin (Timeout) 2253 - mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_pqrcpend (Timeout) 2254 - mpi_dst_example_simple_lap_c_facto1_sched0_not_rqrcpbegin (Timeout) 2255 - mpi_dst_example_simple_lap_c_facto1_sched0_not_rqrcpend (Timeout) 2256 - mpi_dst_example_simple_lap_c_facto1_sched0_kway_rqrcpbegin (Timeout) 2257 - mpi_dst_example_simple_lap_c_facto1_sched0_kway_rqrcpend (Timeout) 2258 - mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_rqrcpbegin (Timeout) 2259 - mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_rqrcpend (Timeout) 2260 - mpi_dst_example_simple_lap_c_facto1_sched0_not_tqrcpbegin (Timeout) 2261 - mpi_dst_example_simple_lap_c_facto1_sched0_not_tqrcpend (Timeout) 2262 - mpi_dst_example_simple_lap_c_facto1_sched0_kway_tqrcpbegin (Timeout) 2263 - mpi_dst_example_simple_lap_c_facto1_sched0_kway_tqrcpend (Timeout) 2264 - mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_tqrcpbegin (Timeout) 2265 - mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_tqrcpend (Timeout) 2266 - mpi_dst_example_simple_lap_c_facto1_sched0_not_rqrrtbegin (Timeout) 2267 - mpi_dst_example_simple_lap_c_facto1_sched0_not_rqrrtend (Timeout) 2269 - mpi_dst_example_simple_lap_c_facto1_sched0_kway_rqrrtend (Timeout) 2270 - mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_rqrrtbegin (Timeout) 2271 - mpi_dst_example_simple_lap_c_facto1_sched0_kwayprojections_rqrrtend (Timeout) 2272 - mpi_dst_example_simple_lap_c_facto1_sched0_kway_pqrcpilu0 (Timeout) 2273 - mpi_dst_example_simple_lap_c_facto1_sched0_kway_pqrcpilu1 (Timeout) 2274 - 
mpi_dst_example_simple_lap_c_facto2_sched0_not_svdbegin (Timeout) 2276 - mpi_dst_example_simple_lap_c_facto2_sched0_kway_svdbegin (Timeout) 2277 - mpi_dst_example_simple_lap_c_facto2_sched0_kway_svdend (Timeout) 2278 - mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_svdbegin (Timeout) 2279 - mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_svdend (Timeout) 2280 - mpi_dst_example_simple_lap_c_facto2_sched0_not_pqrcpbegin (Timeout) 2281 - mpi_dst_example_simple_lap_c_facto2_sched0_not_pqrcpend (Timeout) 2282 - mpi_dst_example_simple_lap_c_facto2_sched0_kway_pqrcpbegin (Timeout) 2283 - mpi_dst_example_simple_lap_c_facto2_sched0_kway_pqrcpend (Timeout) 2284 - mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_pqrcpbegin (Timeout) 2285 - mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_pqrcpend (Timeout) 2286 - mpi_dst_example_simple_lap_c_facto2_sched0_not_rqrcpbegin (Timeout) 2287 - mpi_dst_example_simple_lap_c_facto2_sched0_not_rqrcpend (Timeout) 2288 - mpi_dst_example_simple_lap_c_facto2_sched0_kway_rqrcpbegin (Timeout) 2289 - mpi_dst_example_simple_lap_c_facto2_sched0_kway_rqrcpend (Timeout) 2290 - mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_rqrcpbegin (Timeout) 2291 - mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_rqrcpend (Timeout) 2292 - mpi_dst_example_simple_lap_c_facto2_sched0_not_tqrcpbegin (Timeout) 2293 - mpi_dst_example_simple_lap_c_facto2_sched0_not_tqrcpend (Timeout) 2294 - mpi_dst_example_simple_lap_c_facto2_sched0_kway_tqrcpbegin (Timeout) 2295 - mpi_dst_example_simple_lap_c_facto2_sched0_kway_tqrcpend (Timeout) 2296 - mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_tqrcpbegin (Timeout) 2297 - mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_tqrcpend (Timeout) 2298 - mpi_dst_example_simple_lap_c_facto2_sched0_not_rqrrtbegin (Timeout) 2299 - mpi_dst_example_simple_lap_c_facto2_sched0_not_rqrrtend (Timeout) 2300 - 
mpi_dst_example_simple_lap_c_facto2_sched0_kway_rqrrtbegin (Timeout) 2301 - mpi_dst_example_simple_lap_c_facto2_sched0_kway_rqrrtend (Timeout) 2302 - mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_rqrrtbegin (Timeout) 2303 - mpi_dst_example_simple_lap_c_facto2_sched0_kwayprojections_rqrrtend (Timeout) 2304 - mpi_dst_example_simple_lap_c_facto2_sched0_kway_pqrcpilu0 (Timeout) 2305 - mpi_dst_example_simple_lap_c_facto2_sched0_kway_pqrcpilu1 (Timeout) 2306 - mpi_dst_example_simple_lap_c_facto3_sched0_not_svdbegin (Timeout) 2307 - mpi_dst_example_simple_lap_c_facto3_sched0_not_svdend (Timeout) 2308 - mpi_dst_example_simple_lap_c_facto3_sched0_kway_svdbegin (Timeout) 2309 - mpi_dst_example_simple_lap_c_facto3_sched0_kway_svdend (Timeout) 2310 - mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_svdbegin (Timeout) 2311 - mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_svdend (Timeout) 2312 - mpi_dst_example_simple_lap_c_facto3_sched0_not_pqrcpbegin (Timeout) 2313 - mpi_dst_example_simple_lap_c_facto3_sched0_not_pqrcpend (Timeout) 2315 - mpi_dst_example_simple_lap_c_facto3_sched0_kway_pqrcpend (Timeout) 2316 - mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_pqrcpbegin (Timeout) 2317 - mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_pqrcpend (Timeout) 2318 - mpi_dst_example_simple_lap_c_facto3_sched0_not_rqrcpbegin (Timeout) 2319 - mpi_dst_example_simple_lap_c_facto3_sched0_not_rqrcpend (Timeout) 2320 - mpi_dst_example_simple_lap_c_facto3_sched0_kway_rqrcpbegin (Timeout) 2321 - mpi_dst_example_simple_lap_c_facto3_sched0_kway_rqrcpend (Timeout) 2322 - mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_rqrcpbegin (Timeout) 2323 - mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_rqrcpend (Timeout) 2324 - mpi_dst_example_simple_lap_c_facto3_sched0_not_tqrcpbegin (Timeout) 2325 - mpi_dst_example_simple_lap_c_facto3_sched0_not_tqrcpend (Timeout) 2326 - 
mpi_dst_example_simple_lap_c_facto3_sched0_kway_tqrcpbegin (Timeout) 2327 - mpi_dst_example_simple_lap_c_facto3_sched0_kway_tqrcpend (Timeout) 2328 - mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_tqrcpbegin (Timeout) 2329 - mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_tqrcpend (Timeout) 2330 - mpi_dst_example_simple_lap_c_facto3_sched0_not_rqrrtbegin (Timeout) 2331 - mpi_dst_example_simple_lap_c_facto3_sched0_not_rqrrtend (Timeout) 2332 - mpi_dst_example_simple_lap_c_facto3_sched0_kway_rqrrtbegin (Timeout) 2333 - mpi_dst_example_simple_lap_c_facto3_sched0_kway_rqrrtend (Timeout) 2334 - mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_rqrrtbegin (Timeout) 2335 - mpi_dst_example_simple_lap_c_facto3_sched0_kwayprojections_rqrrtend (Timeout) 2336 - mpi_dst_example_simple_lap_c_facto3_sched0_kway_pqrcpilu0 (Timeout) 2337 - mpi_dst_example_simple_lap_c_facto3_sched0_kway_pqrcpilu1 (Timeout) 2338 - mpi_dst_example_simple_lap_c_facto4_sched0_not_svdbegin (Timeout) 2339 - mpi_dst_example_simple_lap_c_facto4_sched0_not_svdend (Timeout) 2340 - mpi_dst_example_simple_lap_c_facto4_sched0_kway_svdbegin (Timeout) 2341 - mpi_dst_example_simple_lap_c_facto4_sched0_kway_svdend (Timeout) 2342 - mpi_dst_example_simple_lap_c_facto4_sched0_kwayprojections_svdbegin (Timeout) 2343 - mpi_dst_example_simple_lap_c_facto4_sched0_kwayprojections_svdend (Timeout) 2344 - mpi_dst_example_simple_lap_c_facto4_sched0_not_pqrcpbegin (Timeout) 2346 - mpi_dst_example_simple_lap_c_facto4_sched0_kway_pqrcpbegin (Timeout) 2347 - mpi_dst_example_simple_lap_c_facto4_sched0_kway_pqrcpend (Timeout) 2348 - mpi_dst_example_simple_lap_c_facto4_sched0_kwayprojections_pqrcpbegin (Timeout) 2349 - mpi_dst_example_simple_lap_c_facto4_sched0_kwayprojections_pqrcpend (Timeout) 2356 - mpi_dst_example_simple_lap_c_facto4_sched0_not_tqrcpbegin (Timeout) 2388 - mpi_dst_example_simple_lap_z_facto0_sched0_not_tqrcpbegin (Timeout) 2389 - 
mpi_dst_example_simple_lap_z_facto0_sched0_not_tqrcpend (Timeout) 2390 - mpi_dst_example_simple_lap_z_facto0_sched0_kway_tqrcpbegin (Timeout) 2391 - mpi_dst_example_simple_lap_z_facto0_sched0_kway_tqrcpend (Timeout) 2392 - mpi_dst_example_simple_lap_z_facto0_sched0_kwayprojections_tqrcpbegin (Timeout) 2393 - mpi_dst_example_simple_lap_z_facto0_sched0_kwayprojections_tqrcpend (Timeout) 2394 - mpi_dst_example_simple_lap_z_facto0_sched0_not_rqrrtbegin (Timeout) 2395 - mpi_dst_example_simple_lap_z_facto0_sched0_not_rqrrtend (Timeout) 2396 - mpi_dst_example_simple_lap_z_facto0_sched0_kway_rqrrtbegin (Timeout) 2397 - mpi_dst_example_simple_lap_z_facto0_sched0_kway_rqrrtend (Timeout) 2398 - mpi_dst_example_simple_lap_z_facto0_sched0_kwayprojections_rqrrtbegin (Timeout) 2399 - mpi_dst_example_simple_lap_z_facto0_sched0_kwayprojections_rqrrtend (Timeout) 2400 - mpi_dst_example_simple_lap_z_facto0_sched0_kway_pqrcpilu0 (Timeout) 2401 - mpi_dst_example_simple_lap_z_facto0_sched0_kway_pqrcpilu1 (Timeout) 2402 - mpi_dst_example_simple_lap_z_facto1_sched0_not_svdbegin (Timeout) 2403 - mpi_dst_example_simple_lap_z_facto1_sched0_not_svdend (Timeout) 2404 - mpi_dst_example_simple_lap_z_facto1_sched0_kway_svdbegin (Timeout) 2405 - mpi_dst_example_simple_lap_z_facto1_sched0_kway_svdend (Timeout) 2406 - mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_svdbegin (Timeout) 2408 - mpi_dst_example_simple_lap_z_facto1_sched0_not_pqrcpbegin (Timeout) 2409 - mpi_dst_example_simple_lap_z_facto1_sched0_not_pqrcpend (Timeout) 2410 - mpi_dst_example_simple_lap_z_facto1_sched0_kway_pqrcpbegin (Timeout) 2412 - mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_pqrcpbegin (Timeout) 2414 - mpi_dst_example_simple_lap_z_facto1_sched0_not_rqrcpbegin (Timeout) 2415 - mpi_dst_example_simple_lap_z_facto1_sched0_not_rqrcpend (Timeout) 2416 - mpi_dst_example_simple_lap_z_facto1_sched0_kway_rqrcpbegin (Timeout) 2417 - mpi_dst_example_simple_lap_z_facto1_sched0_kway_rqrcpend (Timeout) 2418 - 
mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_rqrcpbegin (Timeout) 2419 - mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_rqrcpend (Timeout) 2420 - mpi_dst_example_simple_lap_z_facto1_sched0_not_tqrcpbegin (Timeout) 2421 - mpi_dst_example_simple_lap_z_facto1_sched0_not_tqrcpend (Timeout) 2423 - mpi_dst_example_simple_lap_z_facto1_sched0_kway_tqrcpend (Timeout) 2425 - mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_tqrcpend (Timeout) 2427 - mpi_dst_example_simple_lap_z_facto1_sched0_not_rqrrtend (Timeout) 2428 - mpi_dst_example_simple_lap_z_facto1_sched0_kway_rqrrtbegin (Timeout) 2430 - mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_rqrrtbegin (Timeout) 2431 - mpi_dst_example_simple_lap_z_facto1_sched0_kwayprojections_rqrrtend (Timeout) 2433 - mpi_dst_example_simple_lap_z_facto1_sched0_kway_pqrcpilu1 (Timeout) 2435 - mpi_dst_example_simple_lap_z_facto2_sched0_not_svdend (Timeout) 2436 - mpi_dst_example_simple_lap_z_facto2_sched0_kway_svdbegin (Timeout) 2438 - mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_svdbegin (Timeout) 2439 - mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_svdend (Timeout) 2440 - mpi_dst_example_simple_lap_z_facto2_sched0_not_pqrcpbegin (Timeout) 2441 - mpi_dst_example_simple_lap_z_facto2_sched0_not_pqrcpend (Timeout) 2442 - mpi_dst_example_simple_lap_z_facto2_sched0_kway_pqrcpbegin (Timeout) 2443 - mpi_dst_example_simple_lap_z_facto2_sched0_kway_pqrcpend (Timeout) 2444 - mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_pqrcpbegin (Timeout) 2445 - mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_pqrcpend (Timeout) 2446 - mpi_dst_example_simple_lap_z_facto2_sched0_not_rqrcpbegin (Timeout) 2447 - mpi_dst_example_simple_lap_z_facto2_sched0_not_rqrcpend (Timeout) 2448 - mpi_dst_example_simple_lap_z_facto2_sched0_kway_rqrcpbegin (Timeout) 2450 - mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_rqrcpbegin (Timeout) 2451 - 
mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_rqrcpend (Timeout) 2452 - mpi_dst_example_simple_lap_z_facto2_sched0_not_tqrcpbegin (Timeout) 2453 - mpi_dst_example_simple_lap_z_facto2_sched0_not_tqrcpend (Timeout) 2454 - mpi_dst_example_simple_lap_z_facto2_sched0_kway_tqrcpbegin (Timeout) 2455 - mpi_dst_example_simple_lap_z_facto2_sched0_kway_tqrcpend (Timeout) 2456 - mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_tqrcpbegin (Timeout) 2457 - mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_tqrcpend (Timeout) 2458 - mpi_dst_example_simple_lap_z_facto2_sched0_not_rqrrtbegin (Timeout) 2459 - mpi_dst_example_simple_lap_z_facto2_sched0_not_rqrrtend (Timeout) 2460 - mpi_dst_example_simple_lap_z_facto2_sched0_kway_rqrrtbegin (Timeout) 2462 - mpi_dst_example_simple_lap_z_facto2_sched0_kwayprojections_rqrrtbegin (Timeout) 2464 - mpi_dst_example_simple_lap_z_facto2_sched0_kway_pqrcpilu0 (Timeout) 2465 - mpi_dst_example_simple_lap_z_facto2_sched0_kway_pqrcpilu1 (Timeout) 2466 - mpi_dst_example_simple_lap_z_facto3_sched0_not_svdbegin (Timeout) 2467 - mpi_dst_example_simple_lap_z_facto3_sched0_not_svdend (Timeout) 2468 - mpi_dst_example_simple_lap_z_facto3_sched0_kway_svdbegin (Timeout) 2469 - mpi_dst_example_simple_lap_z_facto3_sched0_kway_svdend (Timeout) 2470 - mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_svdbegin (Timeout) 2472 - mpi_dst_example_simple_lap_z_facto3_sched0_not_pqrcpbegin (Timeout) 2473 - mpi_dst_example_simple_lap_z_facto3_sched0_not_pqrcpend (Timeout) 2474 - mpi_dst_example_simple_lap_z_facto3_sched0_kway_pqrcpbegin (Timeout) 2475 - mpi_dst_example_simple_lap_z_facto3_sched0_kway_pqrcpend (Timeout) 2476 - mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_pqrcpbegin (Timeout) 2477 - mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_pqrcpend (Timeout) 2478 - mpi_dst_example_simple_lap_z_facto3_sched0_not_rqrcpbegin (Timeout) 2479 - mpi_dst_example_simple_lap_z_facto3_sched0_not_rqrcpend 
(Timeout) 2480 - mpi_dst_example_simple_lap_z_facto3_sched0_kway_rqrcpbegin (Timeout) 2481 - mpi_dst_example_simple_lap_z_facto3_sched0_kway_rqrcpend (Timeout) 2482 - mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_rqrcpbegin (Timeout) 2483 - mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_rqrcpend (Timeout) 2484 - mpi_dst_example_simple_lap_z_facto3_sched0_not_tqrcpbegin (Timeout) 2485 - mpi_dst_example_simple_lap_z_facto3_sched0_not_tqrcpend (Timeout) 2486 - mpi_dst_example_simple_lap_z_facto3_sched0_kway_tqrcpbegin (Timeout) 2487 - mpi_dst_example_simple_lap_z_facto3_sched0_kway_tqrcpend (Timeout) 2488 - mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_tqrcpbegin (Timeout) 2489 - mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_tqrcpend (Timeout) 2490 - mpi_dst_example_simple_lap_z_facto3_sched0_not_rqrrtbegin (Timeout) 2491 - mpi_dst_example_simple_lap_z_facto3_sched0_not_rqrrtend (Timeout) 2492 - mpi_dst_example_simple_lap_z_facto3_sched0_kway_rqrrtbegin (Timeout) 2493 - mpi_dst_example_simple_lap_z_facto3_sched0_kway_rqrrtend (Timeout) 2494 - mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_rqrrtbegin (Timeout) 2495 - mpi_dst_example_simple_lap_z_facto3_sched0_kwayprojections_rqrrtend (Timeout) 2496 - mpi_dst_example_simple_lap_z_facto3_sched0_kway_pqrcpilu0 (Timeout) 2497 - mpi_dst_example_simple_lap_z_facto3_sched0_kway_pqrcpilu1 (Timeout) 2498 - mpi_dst_example_simple_lap_z_facto4_sched0_not_svdbegin (Timeout) 2499 - mpi_dst_example_simple_lap_z_facto4_sched0_not_svdend (Timeout) 2500 - mpi_dst_example_simple_lap_z_facto4_sched0_kway_svdbegin (Timeout) 2501 - mpi_dst_example_simple_lap_z_facto4_sched0_kway_svdend (Timeout) 2502 - mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_svdbegin (Timeout) 2503 - mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_svdend (Timeout) 2504 - mpi_dst_example_simple_lap_z_facto4_sched0_not_pqrcpbegin (Timeout) 2505 - 
mpi_dst_example_simple_lap_z_facto4_sched0_not_pqrcpend (Timeout) 2506 - mpi_dst_example_simple_lap_z_facto4_sched0_kway_pqrcpbegin (Timeout) 2507 - mpi_dst_example_simple_lap_z_facto4_sched0_kway_pqrcpend (Timeout) 2508 - mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_pqrcpbegin (Timeout) 2509 - mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_pqrcpend (Timeout) 2510 - mpi_dst_example_simple_lap_z_facto4_sched0_not_rqrcpbegin (Timeout) 2511 - mpi_dst_example_simple_lap_z_facto4_sched0_not_rqrcpend (Timeout) 2512 - mpi_dst_example_simple_lap_z_facto4_sched0_kway_rqrcpbegin (Timeout) 2513 - mpi_dst_example_simple_lap_z_facto4_sched0_kway_rqrcpend (Timeout) 2514 - mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_rqrcpbegin (Timeout) 2515 - mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_rqrcpend (Timeout) 2516 - mpi_dst_example_simple_lap_z_facto4_sched0_not_tqrcpbegin (Timeout) 2517 - mpi_dst_example_simple_lap_z_facto4_sched0_not_tqrcpend (Timeout) 2518 - mpi_dst_example_simple_lap_z_facto4_sched0_kway_tqrcpbegin (Timeout) 2519 - mpi_dst_example_simple_lap_z_facto4_sched0_kway_tqrcpend (Timeout) 2520 - mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_tqrcpbegin (Timeout) 2521 - mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_tqrcpend (Timeout) 2522 - mpi_dst_example_simple_lap_z_facto4_sched0_not_rqrrtbegin (Timeout) 2523 - mpi_dst_example_simple_lap_z_facto4_sched0_not_rqrrtend (Timeout) 2524 - mpi_dst_example_simple_lap_z_facto4_sched0_kway_rqrrtbegin (Timeout) 2525 - mpi_dst_example_simple_lap_z_facto4_sched0_kway_rqrrtend (Timeout) 2526 - mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_rqrrtbegin (Timeout) 2527 - mpi_dst_example_simple_lap_z_facto4_sched0_kwayprojections_rqrrtend (Timeout) 2528 - mpi_dst_example_simple_lap_z_facto4_sched0_kway_pqrcpilu0 (Timeout) 2529 - mpi_dst_example_simple_lap_z_facto4_sched0_kway_pqrcpilu1 (Timeout) 2530 - 
mpi_dst_example_simple_lap_s_facto0_sched1_not_svdbegin (Timeout) 2531 - mpi_dst_example_simple_lap_s_facto0_sched1_not_svdend (Timeout) 2532 - mpi_dst_example_simple_lap_s_facto0_sched1_kway_svdbegin (Timeout) 2533 - mpi_dst_example_simple_lap_s_facto0_sched1_kway_svdend (Timeout) 2534 - mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_svdbegin (Timeout) 2535 - mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_svdend (Timeout) 2536 - mpi_dst_example_simple_lap_s_facto0_sched1_not_pqrcpbegin (Timeout) 2537 - mpi_dst_example_simple_lap_s_facto0_sched1_not_pqrcpend (Timeout) 2538 - mpi_dst_example_simple_lap_s_facto0_sched1_kway_pqrcpbegin (Timeout) 2539 - mpi_dst_example_simple_lap_s_facto0_sched1_kway_pqrcpend (Timeout) 2540 - mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_pqrcpbegin (Timeout) 2541 - mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_pqrcpend (Timeout) 2542 - mpi_dst_example_simple_lap_s_facto0_sched1_not_rqrcpbegin (Timeout) 2543 - mpi_dst_example_simple_lap_s_facto0_sched1_not_rqrcpend (Timeout) 2544 - mpi_dst_example_simple_lap_s_facto0_sched1_kway_rqrcpbegin (Timeout) 2545 - mpi_dst_example_simple_lap_s_facto0_sched1_kway_rqrcpend (Timeout) 2546 - mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_rqrcpbegin (Timeout) 2547 - mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_rqrcpend (Timeout) 2548 - mpi_dst_example_simple_lap_s_facto0_sched1_not_tqrcpbegin (Timeout) 2549 - mpi_dst_example_simple_lap_s_facto0_sched1_not_tqrcpend (Timeout) 2550 - mpi_dst_example_simple_lap_s_facto0_sched1_kway_tqrcpbegin (Timeout) 2551 - mpi_dst_example_simple_lap_s_facto0_sched1_kway_tqrcpend (Timeout) 2552 - mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_tqrcpbegin (Timeout) 2553 - mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_tqrcpend (Timeout) 2554 - mpi_dst_example_simple_lap_s_facto0_sched1_not_rqrrtbegin (Timeout) 2555 - mpi_dst_example_simple_lap_s_facto0_sched1_not_rqrrtend 
(Timeout) 2556 - mpi_dst_example_simple_lap_s_facto0_sched1_kway_rqrrtbegin (Timeout) 2557 - mpi_dst_example_simple_lap_s_facto0_sched1_kway_rqrrtend (Timeout) 2558 - mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_rqrrtbegin (Timeout) 2559 - mpi_dst_example_simple_lap_s_facto0_sched1_kwayprojections_rqrrtend (Timeout) 2560 - mpi_dst_example_simple_lap_s_facto0_sched1_kway_pqrcpilu0 (Timeout) 2561 - mpi_dst_example_simple_lap_s_facto0_sched1_kway_pqrcpilu1 (Timeout) 2562 - mpi_dst_example_simple_lap_s_facto1_sched1_not_svdbegin (Timeout) 2563 - mpi_dst_example_simple_lap_s_facto1_sched1_not_svdend (Timeout) 2564 - mpi_dst_example_simple_lap_s_facto1_sched1_kway_svdbegin (Timeout) 2565 - mpi_dst_example_simple_lap_s_facto1_sched1_kway_svdend (Timeout) 2566 - mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_svdbegin (Timeout) 2567 - mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_svdend (Timeout) 2568 - mpi_dst_example_simple_lap_s_facto1_sched1_not_pqrcpbegin (Timeout) 2569 - mpi_dst_example_simple_lap_s_facto1_sched1_not_pqrcpend (Timeout) 2570 - mpi_dst_example_simple_lap_s_facto1_sched1_kway_pqrcpbegin (Timeout) 2571 - mpi_dst_example_simple_lap_s_facto1_sched1_kway_pqrcpend (Timeout) 2572 - mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_pqrcpbegin (Timeout) 2573 - mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_pqrcpend (Timeout) 2574 - mpi_dst_example_simple_lap_s_facto1_sched1_not_rqrcpbegin (Timeout) 2575 - mpi_dst_example_simple_lap_s_facto1_sched1_not_rqrcpend (Timeout) 2576 - mpi_dst_example_simple_lap_s_facto1_sched1_kway_rqrcpbegin (Timeout) 2577 - mpi_dst_example_simple_lap_s_facto1_sched1_kway_rqrcpend (Timeout) 2578 - mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_rqrcpbegin (Timeout) 2579 - mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_rqrcpend (Timeout) 2580 - mpi_dst_example_simple_lap_s_facto1_sched1_not_tqrcpbegin (Timeout) 2581 - 
mpi_dst_example_simple_lap_s_facto1_sched1_not_tqrcpend (Timeout) 2582 - mpi_dst_example_simple_lap_s_facto1_sched1_kway_tqrcpbegin (Timeout) 2583 - mpi_dst_example_simple_lap_s_facto1_sched1_kway_tqrcpend (Timeout) 2584 - mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_tqrcpbegin (Timeout) 2585 - mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_tqrcpend (Timeout) 2586 - mpi_dst_example_simple_lap_s_facto1_sched1_not_rqrrtbegin (Timeout) 2587 - mpi_dst_example_simple_lap_s_facto1_sched1_not_rqrrtend (Timeout) 2588 - mpi_dst_example_simple_lap_s_facto1_sched1_kway_rqrrtbegin (Timeout) 2589 - mpi_dst_example_simple_lap_s_facto1_sched1_kway_rqrrtend (Timeout) 2590 - mpi_dst_example_simple_lap_s_facto1_sched1_kwayprojections_rqrrtbegin (Timeout) 2591 - mpi_dst_example_simple_lap_s_facto1_sched1_kway_pqrcpilu0 (Timeout) 2592 - mpi_dst_example_simple_lap_s_facto1_sched1_kway_pqrcpilu1 (Timeout) 2593 - mpi_dst_example_simple_lap_s_facto2_sched1_not_svdbegin (Timeout) 2594 - mpi_dst_example_simple_lap_s_facto2_sched1_not_svdend (Timeout) 2595 - mpi_dst_example_simple_lap_s_facto2_sched1_kway_svdbegin (Timeout) 2596 - mpi_dst_example_simple_lap_s_facto2_sched1_kway_svdend (Timeout) 2597 - mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_svdbegin (Timeout) 2598 - mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_svdend (Timeout) 2599 - mpi_dst_example_simple_lap_s_facto2_sched1_not_pqrcpbegin (Timeout) 2600 - mpi_dst_example_simple_lap_s_facto2_sched1_not_pqrcpend (Timeout) 2601 - mpi_dst_example_simple_lap_s_facto2_sched1_kway_pqrcpbegin (Timeout) 2602 - mpi_dst_example_simple_lap_s_facto2_sched1_kway_pqrcpend (Timeout) 2603 - mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_pqrcpbegin (Timeout) 2604 - mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_pqrcpend (Timeout) 2605 - mpi_dst_example_simple_lap_s_facto2_sched1_not_rqrcpbegin (Timeout) 2606 - mpi_dst_example_simple_lap_s_facto2_sched1_not_rqrcpend (Timeout) 
2607 - mpi_dst_example_simple_lap_s_facto2_sched1_kway_rqrcpbegin (Timeout) 2608 - mpi_dst_example_simple_lap_s_facto2_sched1_kway_rqrcpend (Timeout) 2609 - mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_rqrcpbegin (Timeout) 2610 - mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_rqrcpend (Timeout) 2611 - mpi_dst_example_simple_lap_s_facto2_sched1_not_tqrcpbegin (Timeout) 2612 - mpi_dst_example_simple_lap_s_facto2_sched1_not_tqrcpend (Timeout) 2614 - mpi_dst_example_simple_lap_s_facto2_sched1_kway_tqrcpend (Timeout) 2615 - mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_tqrcpbegin (Timeout) 2616 - mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_tqrcpend (Timeout) 2617 - mpi_dst_example_simple_lap_s_facto2_sched1_not_rqrrtbegin (Timeout) 2619 - mpi_dst_example_simple_lap_s_facto2_sched1_kway_rqrrtbegin (Timeout) 2620 - mpi_dst_example_simple_lap_s_facto2_sched1_kway_rqrrtend (Timeout) 2622 - mpi_dst_example_simple_lap_s_facto2_sched1_kwayprojections_rqrrtend (Timeout) 2625 - mpi_dst_example_simple_lap_d_facto0_sched1_not_svdbegin (Timeout) 2626 - mpi_dst_example_simple_lap_d_facto0_sched1_not_svdend (Timeout) 2627 - mpi_dst_example_simple_lap_d_facto0_sched1_kway_svdbegin (Timeout) 2629 - mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_svdbegin (Timeout) 2630 - mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_svdend (Timeout) 2631 - mpi_dst_example_simple_lap_d_facto0_sched1_not_pqrcpbegin (Timeout) 2633 - mpi_dst_example_simple_lap_d_facto0_sched1_kway_pqrcpbegin (Timeout) 2636 - mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_pqrcpend (Timeout) 2637 - mpi_dst_example_simple_lap_d_facto0_sched1_not_rqrcpbegin (Timeout) 2639 - mpi_dst_example_simple_lap_d_facto0_sched1_kway_rqrcpbegin (Timeout) 2642 - mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_rqrcpend (Timeout) 2648 - mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_tqrcpend (Timeout) 2649 - 
mpi_dst_example_simple_lap_d_facto0_sched1_not_rqrrtbegin (Timeout) 2651 - mpi_dst_example_simple_lap_d_facto0_sched1_kway_rqrrtbegin (Timeout) 2652 - mpi_dst_example_simple_lap_d_facto0_sched1_kway_rqrrtend (Timeout) 2653 - mpi_dst_example_simple_lap_d_facto0_sched1_kwayprojections_rqrrtbegin (Timeout) 2655 - mpi_dst_example_simple_lap_d_facto0_sched1_kway_pqrcpilu0 (Timeout) 2656 - mpi_dst_example_simple_lap_d_facto0_sched1_kway_pqrcpilu1 (Timeout) 2657 - mpi_dst_example_simple_lap_d_facto1_sched1_not_svdbegin (Timeout) 2659 - mpi_dst_example_simple_lap_d_facto1_sched1_kway_svdbegin (Timeout) 2660 - mpi_dst_example_simple_lap_d_facto1_sched1_kway_svdend (Timeout) 2661 - mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_svdbegin (Timeout) 2662 - mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_svdend (Timeout) 2664 - mpi_dst_example_simple_lap_d_facto1_sched1_not_pqrcpend (Timeout) 2665 - mpi_dst_example_simple_lap_d_facto1_sched1_kway_pqrcpbegin (Timeout) 2666 - mpi_dst_example_simple_lap_d_facto1_sched1_kway_pqrcpend (Timeout) 2667 - mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_pqrcpbegin (Timeout) 2668 - mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_pqrcpend (Timeout) 2669 - mpi_dst_example_simple_lap_d_facto1_sched1_not_rqrcpbegin (Timeout) 2670 - mpi_dst_example_simple_lap_d_facto1_sched1_not_rqrcpend (Timeout) 2671 - mpi_dst_example_simple_lap_d_facto1_sched1_kway_rqrcpbegin (Timeout) 2672 - mpi_dst_example_simple_lap_d_facto1_sched1_kway_rqrcpend (Timeout) 2673 - mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_rqrcpbegin (Timeout) 2675 - mpi_dst_example_simple_lap_d_facto1_sched1_not_tqrcpbegin (Timeout) 2676 - mpi_dst_example_simple_lap_d_facto1_sched1_not_tqrcpend (Timeout) 2677 - mpi_dst_example_simple_lap_d_facto1_sched1_kway_tqrcpbegin (Timeout) 2678 - mpi_dst_example_simple_lap_d_facto1_sched1_kway_tqrcpend (Timeout) 2679 - mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_tqrcpbegin 
(Timeout) 2680 - mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_tqrcpend (Timeout) 2681 - mpi_dst_example_simple_lap_d_facto1_sched1_not_rqrrtbegin (Timeout) 2682 - mpi_dst_example_simple_lap_d_facto1_sched1_not_rqrrtend (Timeout) 2683 - mpi_dst_example_simple_lap_d_facto1_sched1_kway_rqrrtbegin (Timeout) 2684 - mpi_dst_example_simple_lap_d_facto1_sched1_kway_rqrrtend (Timeout) 2686 - mpi_dst_example_simple_lap_d_facto1_sched1_kwayprojections_rqrrtend (Timeout) 2689 - mpi_dst_example_simple_lap_d_facto2_sched1_not_svdbegin (Timeout) 2690 - mpi_dst_example_simple_lap_d_facto2_sched1_not_svdend (Timeout) 2692 - mpi_dst_example_simple_lap_d_facto2_sched1_kway_svdend (Timeout) 2693 - mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_svdbegin (Timeout) 2696 - mpi_dst_example_simple_lap_d_facto2_sched1_not_pqrcpend (Timeout) 2697 - mpi_dst_example_simple_lap_d_facto2_sched1_kway_pqrcpbegin (Timeout) 2698 - mpi_dst_example_simple_lap_d_facto2_sched1_kway_pqrcpend (Timeout) 2699 - mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_pqrcpbegin (Timeout) 2700 - mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_pqrcpend (Timeout) 2701 - mpi_dst_example_simple_lap_d_facto2_sched1_not_rqrcpbegin (Timeout) 2702 - mpi_dst_example_simple_lap_d_facto2_sched1_not_rqrcpend (Timeout) 2703 - mpi_dst_example_simple_lap_d_facto2_sched1_kway_rqrcpbegin (Timeout) 2704 - mpi_dst_example_simple_lap_d_facto2_sched1_kway_rqrcpend (Timeout) 2705 - mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_rqrcpbegin (Timeout) 2706 - mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_rqrcpend (Timeout) 2707 - mpi_dst_example_simple_lap_d_facto2_sched1_not_tqrcpbegin (Timeout) 2709 - mpi_dst_example_simple_lap_d_facto2_sched1_kway_tqrcpbegin (Timeout) 2710 - mpi_dst_example_simple_lap_d_facto2_sched1_kway_tqrcpend (Timeout) 2711 - mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_tqrcpbegin (Timeout) 2712 - 
mpi_dst_example_simple_lap_d_facto2_sched1_kwayprojections_tqrcpend (Timeout) 2713 - mpi_dst_example_simple_lap_d_facto2_sched1_not_rqrrtbegin (Timeout) 2732 - mpi_dst_example_simple_lap_c_facto0_sched1_kwayprojections_pqrcpend (Failed) 2807 - mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_tqrcpbegin (Timeout) 2810 - mpi_dst_example_simple_lap_c_facto2_sched1_not_rqrrtend (Timeout) 2811 - mpi_dst_example_simple_lap_c_facto2_sched1_kway_rqrrtbegin (Timeout) 2812 - mpi_dst_example_simple_lap_c_facto2_sched1_kway_rqrrtend (Timeout) 2814 - mpi_dst_example_simple_lap_c_facto2_sched1_kwayprojections_rqrrtend (Timeout) 2834 - mpi_dst_example_simple_lap_c_facto3_sched1_kwayprojections_rqrcpend (Timeout) 2835 - mpi_dst_example_simple_lap_c_facto3_sched1_not_tqrcpbegin (Timeout) 2836 - mpi_dst_example_simple_lap_c_facto3_sched1_not_tqrcpend (Timeout) 2837 - mpi_dst_example_simple_lap_c_facto3_sched1_kway_tqrcpbegin (Timeout) 2843 - mpi_dst_example_simple_lap_c_facto3_sched1_kway_rqrrtbegin (Timeout) 2846 - mpi_dst_example_simple_lap_c_facto3_sched1_kwayprojections_rqrrtend (Timeout) 2847 - mpi_dst_example_simple_lap_c_facto3_sched1_kway_pqrcpilu0 (Timeout) 2848 - mpi_dst_example_simple_lap_c_facto3_sched1_kway_pqrcpilu1 (Timeout) 2851 - mpi_dst_example_simple_lap_c_facto4_sched1_kway_svdbegin (Timeout) 2853 - mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_svdbegin (Timeout) 2855 - mpi_dst_example_simple_lap_c_facto4_sched1_not_pqrcpbegin (Failed) 2858 - mpi_dst_example_simple_lap_c_facto4_sched1_kway_pqrcpend (Timeout) 2860 - mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_pqrcpend (Timeout) 2862 - mpi_dst_example_simple_lap_c_facto4_sched1_not_rqrcpend (Timeout) 2863 - mpi_dst_example_simple_lap_c_facto4_sched1_kway_rqrcpbegin (Timeout) 2865 - mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_rqrcpbegin (Timeout) 2870 - mpi_dst_example_simple_lap_c_facto4_sched1_kway_tqrcpend (Timeout) 2877 - 
mpi_dst_example_simple_lap_c_facto4_sched1_kwayprojections_rqrrtbegin (Timeout) 2882 - mpi_dst_example_simple_lap_z_facto0_sched1_not_svdend (Timeout) 2891 - mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_pqrcpbegin (Timeout) 2894 - mpi_dst_example_simple_lap_z_facto0_sched1_not_rqrcpend (Timeout) 2897 - mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_rqrcpbegin (Timeout) 2898 - mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_rqrcpend (Timeout) 2899 - mpi_dst_example_simple_lap_z_facto0_sched1_not_tqrcpbegin (Timeout) 2904 - mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_tqrcpend (Timeout) 2905 - mpi_dst_example_simple_lap_z_facto0_sched1_not_rqrrtbegin (Timeout) 2906 - mpi_dst_example_simple_lap_z_facto0_sched1_not_rqrrtend (Timeout) 2907 - mpi_dst_example_simple_lap_z_facto0_sched1_kway_rqrrtbegin (Timeout) 2908 - mpi_dst_example_simple_lap_z_facto0_sched1_kway_rqrrtend (Timeout) 2909 - mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_rqrrtbegin (Timeout) 2910 - mpi_dst_example_simple_lap_z_facto0_sched1_kwayprojections_rqrrtend (Timeout) 2912 - mpi_dst_example_simple_lap_z_facto0_sched1_kway_pqrcpilu1 (Timeout) 2913 - mpi_dst_example_simple_lap_z_facto1_sched1_not_svdbegin (Timeout) 2915 - mpi_dst_example_simple_lap_z_facto1_sched1_kway_svdbegin (Timeout) 2916 - mpi_dst_example_simple_lap_z_facto1_sched1_kway_svdend (Timeout) 2917 - mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_svdbegin (Timeout) 2918 - mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_svdend (Timeout) 2919 - mpi_dst_example_simple_lap_z_facto1_sched1_not_pqrcpbegin (Timeout) 2920 - mpi_dst_example_simple_lap_z_facto1_sched1_not_pqrcpend (Timeout) 2921 - mpi_dst_example_simple_lap_z_facto1_sched1_kway_pqrcpbegin (Timeout) 2922 - mpi_dst_example_simple_lap_z_facto1_sched1_kway_pqrcpend (Timeout) 2923 - mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_pqrcpbegin (Timeout) 2924 - 
mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_pqrcpend (Timeout) 2925 - mpi_dst_example_simple_lap_z_facto1_sched1_not_rqrcpbegin (Timeout) 2926 - mpi_dst_example_simple_lap_z_facto1_sched1_not_rqrcpend (Timeout) 2927 - mpi_dst_example_simple_lap_z_facto1_sched1_kway_rqrcpbegin (Timeout) 2928 - mpi_dst_example_simple_lap_z_facto1_sched1_kway_rqrcpend (Timeout) 2929 - mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_rqrcpbegin (Timeout) 2930 - mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_rqrcpend (Timeout) 2931 - mpi_dst_example_simple_lap_z_facto1_sched1_not_tqrcpbegin (Timeout) 2932 - mpi_dst_example_simple_lap_z_facto1_sched1_not_tqrcpend (Timeout) 2933 - mpi_dst_example_simple_lap_z_facto1_sched1_kway_tqrcpbegin (Timeout) 2934 - mpi_dst_example_simple_lap_z_facto1_sched1_kway_tqrcpend (Timeout) 2935 - mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_tqrcpbegin (Timeout) 2936 - mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_tqrcpend (Timeout) 2937 - mpi_dst_example_simple_lap_z_facto1_sched1_not_rqrrtbegin (Timeout) 2938 - mpi_dst_example_simple_lap_z_facto1_sched1_not_rqrrtend (Timeout) 2939 - mpi_dst_example_simple_lap_z_facto1_sched1_kway_rqrrtbegin (Timeout) 2940 - mpi_dst_example_simple_lap_z_facto1_sched1_kway_rqrrtend (Timeout) 2941 - mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_rqrrtbegin (Timeout) 2942 - mpi_dst_example_simple_lap_z_facto1_sched1_kwayprojections_rqrrtend (Timeout) 2943 - mpi_dst_example_simple_lap_z_facto1_sched1_kway_pqrcpilu0 (Timeout) 2944 - mpi_dst_example_simple_lap_z_facto1_sched1_kway_pqrcpilu1 (Timeout) 2945 - mpi_dst_example_simple_lap_z_facto2_sched1_not_svdbegin (Timeout) 2946 - mpi_dst_example_simple_lap_z_facto2_sched1_not_svdend (Timeout) 2947 - mpi_dst_example_simple_lap_z_facto2_sched1_kway_svdbegin (Timeout) 2948 - mpi_dst_example_simple_lap_z_facto2_sched1_kway_svdend (Timeout) 2949 - 
mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_svdbegin (Timeout) 2950 - mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_svdend (Timeout) 2951 - mpi_dst_example_simple_lap_z_facto2_sched1_not_pqrcpbegin (Timeout) 2952 - mpi_dst_example_simple_lap_z_facto2_sched1_not_pqrcpend (Timeout) 2953 - mpi_dst_example_simple_lap_z_facto2_sched1_kway_pqrcpbegin (Timeout) 2954 - mpi_dst_example_simple_lap_z_facto2_sched1_kway_pqrcpend (Timeout) 2955 - mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_pqrcpbegin (Timeout) 2956 - mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_pqrcpend (Timeout) 2957 - mpi_dst_example_simple_lap_z_facto2_sched1_not_rqrcpbegin (Timeout) 2958 - mpi_dst_example_simple_lap_z_facto2_sched1_not_rqrcpend (Timeout) 2959 - mpi_dst_example_simple_lap_z_facto2_sched1_kway_rqrcpbegin (Timeout) 2960 - mpi_dst_example_simple_lap_z_facto2_sched1_kway_rqrcpend (Timeout) 2961 - mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_rqrcpbegin (Timeout) 2962 - mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_rqrcpend (Timeout) 2963 - mpi_dst_example_simple_lap_z_facto2_sched1_not_tqrcpbegin (Timeout) 2964 - mpi_dst_example_simple_lap_z_facto2_sched1_not_tqrcpend (Timeout) 2965 - mpi_dst_example_simple_lap_z_facto2_sched1_kway_tqrcpbegin (Timeout) 2966 - mpi_dst_example_simple_lap_z_facto2_sched1_kway_tqrcpend (Timeout) 2967 - mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_tqrcpbegin (Timeout) 2968 - mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_tqrcpend (Timeout) 2969 - mpi_dst_example_simple_lap_z_facto2_sched1_not_rqrrtbegin (Timeout) 2970 - mpi_dst_example_simple_lap_z_facto2_sched1_not_rqrrtend (Timeout) 2971 - mpi_dst_example_simple_lap_z_facto2_sched1_kway_rqrrtbegin (Timeout) 2972 - mpi_dst_example_simple_lap_z_facto2_sched1_kway_rqrrtend (Timeout) 2973 - mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_rqrrtbegin (Timeout) 2974 - 
mpi_dst_example_simple_lap_z_facto2_sched1_kwayprojections_rqrrtend (Timeout) 2975 - mpi_dst_example_simple_lap_z_facto2_sched1_kway_pqrcpilu0 (Timeout) 2976 - mpi_dst_example_simple_lap_z_facto2_sched1_kway_pqrcpilu1 (Timeout) 2977 - mpi_dst_example_simple_lap_z_facto3_sched1_not_svdbegin (Timeout) 2978 - mpi_dst_example_simple_lap_z_facto3_sched1_not_svdend (Timeout) 2979 - mpi_dst_example_simple_lap_z_facto3_sched1_kway_svdbegin (Timeout) 2980 - mpi_dst_example_simple_lap_z_facto3_sched1_kway_svdend (Timeout) 2981 - mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_svdbegin (Timeout) 2982 - mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_svdend (Timeout) 2983 - mpi_dst_example_simple_lap_z_facto3_sched1_not_pqrcpbegin (Timeout) 2985 - mpi_dst_example_simple_lap_z_facto3_sched1_kway_pqrcpbegin (Timeout) 2986 - mpi_dst_example_simple_lap_z_facto3_sched1_kway_pqrcpend (Timeout) 2987 - mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_pqrcpbegin (Timeout) 2988 - mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_pqrcpend (Timeout) 2989 - mpi_dst_example_simple_lap_z_facto3_sched1_not_rqrcpbegin (Timeout) 2990 - mpi_dst_example_simple_lap_z_facto3_sched1_not_rqrcpend (Timeout) 2991 - mpi_dst_example_simple_lap_z_facto3_sched1_kway_rqrcpbegin (Timeout) 2992 - mpi_dst_example_simple_lap_z_facto3_sched1_kway_rqrcpend (Timeout) 2993 - mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_rqrcpbegin (Timeout) 2994 - mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_rqrcpend (Timeout) 2995 - mpi_dst_example_simple_lap_z_facto3_sched1_not_tqrcpbegin (Timeout) 2996 - mpi_dst_example_simple_lap_z_facto3_sched1_not_tqrcpend (Timeout) 2997 - mpi_dst_example_simple_lap_z_facto3_sched1_kway_tqrcpbegin (Timeout) 2998 - mpi_dst_example_simple_lap_z_facto3_sched1_kway_tqrcpend (Timeout) 2999 - mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_tqrcpbegin (Timeout) 3000 - 
mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_tqrcpend (Timeout) 3001 - mpi_dst_example_simple_lap_z_facto3_sched1_not_rqrrtbegin (Timeout) 3002 - mpi_dst_example_simple_lap_z_facto3_sched1_not_rqrrtend (Timeout) 3003 - mpi_dst_example_simple_lap_z_facto3_sched1_kway_rqrrtbegin (Timeout) 3004 - mpi_dst_example_simple_lap_z_facto3_sched1_kway_rqrrtend (Timeout) 3005 - mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_rqrrtbegin (Timeout) 3006 - mpi_dst_example_simple_lap_z_facto3_sched1_kwayprojections_rqrrtend (Timeout) 3007 - mpi_dst_example_simple_lap_z_facto3_sched1_kway_pqrcpilu0 (Timeout) 3008 - mpi_dst_example_simple_lap_z_facto3_sched1_kway_pqrcpilu1 (Timeout) 3009 - mpi_dst_example_simple_lap_z_facto4_sched1_not_svdbegin (Timeout) 3010 - mpi_dst_example_simple_lap_z_facto4_sched1_not_svdend (Timeout) 3011 - mpi_dst_example_simple_lap_z_facto4_sched1_kway_svdbegin (Timeout) 3012 - mpi_dst_example_simple_lap_z_facto4_sched1_kway_svdend (Timeout) 3013 - mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_svdbegin (Timeout) 3014 - mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_svdend (Timeout) 3015 - mpi_dst_example_simple_lap_z_facto4_sched1_not_pqrcpbegin (Timeout) 3016 - mpi_dst_example_simple_lap_z_facto4_sched1_not_pqrcpend (Timeout) 3017 - mpi_dst_example_simple_lap_z_facto4_sched1_kway_pqrcpbegin (Timeout) 3018 - mpi_dst_example_simple_lap_z_facto4_sched1_kway_pqrcpend (Timeout) 3019 - mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_pqrcpbegin (Timeout) 3020 - mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_pqrcpend (Timeout) 3021 - mpi_dst_example_simple_lap_z_facto4_sched1_not_rqrcpbegin (Timeout) 3022 - mpi_dst_example_simple_lap_z_facto4_sched1_not_rqrcpend (Timeout) 3023 - mpi_dst_example_simple_lap_z_facto4_sched1_kway_rqrcpbegin (Timeout) 3024 - mpi_dst_example_simple_lap_z_facto4_sched1_kway_rqrcpend (Timeout) 3025 - 
mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_rqrcpbegin (Timeout) 3026 - mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_rqrcpend (Timeout) 3027 - mpi_dst_example_simple_lap_z_facto4_sched1_not_tqrcpbegin (Timeout) 3028 - mpi_dst_example_simple_lap_z_facto4_sched1_not_tqrcpend (Timeout) 3029 - mpi_dst_example_simple_lap_z_facto4_sched1_kway_tqrcpbegin (Timeout) 3030 - mpi_dst_example_simple_lap_z_facto4_sched1_kway_tqrcpend (Timeout) 3031 - mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_tqrcpbegin (Timeout) 3032 - mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_tqrcpend (Timeout) 3033 - mpi_dst_example_simple_lap_z_facto4_sched1_not_rqrrtbegin (Timeout) 3034 - mpi_dst_example_simple_lap_z_facto4_sched1_not_rqrrtend (Timeout) 3035 - mpi_dst_example_simple_lap_z_facto4_sched1_kway_rqrrtbegin (Timeout) 3036 - mpi_dst_example_simple_lap_z_facto4_sched1_kway_rqrrtend (Timeout) 3037 - mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_rqrrtbegin (Timeout) 3038 - mpi_dst_example_simple_lap_z_facto4_sched1_kwayprojections_rqrrtend (Timeout) 3039 - mpi_dst_example_simple_lap_z_facto4_sched1_kway_pqrcpilu0 (Timeout) 3040 - mpi_dst_example_simple_lap_z_facto4_sched1_kway_pqrcpilu1 (Timeout) 3041 - mpi_dst_example_simple_lap_s_facto0_sched4_not_svdbegin (Timeout) 3042 - mpi_dst_example_simple_lap_s_facto0_sched4_not_svdend (Timeout) 3043 - mpi_dst_example_simple_lap_s_facto0_sched4_kway_svdbegin (Timeout) 3044 - mpi_dst_example_simple_lap_s_facto0_sched4_kway_svdend (Timeout) 3045 - mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_svdbegin (Timeout) 3046 - mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_svdend (Timeout) 3047 - mpi_dst_example_simple_lap_s_facto0_sched4_not_pqrcpbegin (Timeout) 3048 - mpi_dst_example_simple_lap_s_facto0_sched4_not_pqrcpend (Timeout) 3049 - mpi_dst_example_simple_lap_s_facto0_sched4_kway_pqrcpbegin (Timeout) 3050 - 
mpi_dst_example_simple_lap_s_facto0_sched4_kway_pqrcpend (Timeout) 3051 - mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_pqrcpbegin (Timeout) 3052 - mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_pqrcpend (Timeout) 3053 - mpi_dst_example_simple_lap_s_facto0_sched4_not_rqrcpbegin (Timeout) 3054 - mpi_dst_example_simple_lap_s_facto0_sched4_not_rqrcpend (Timeout) 3055 - mpi_dst_example_simple_lap_s_facto0_sched4_kway_rqrcpbegin (Timeout) 3056 - mpi_dst_example_simple_lap_s_facto0_sched4_kway_rqrcpend (Timeout) 3057 - mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_rqrcpbegin (Timeout) 3058 - mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_rqrcpend (Timeout) 3059 - mpi_dst_example_simple_lap_s_facto0_sched4_not_tqrcpbegin (Timeout) 3060 - mpi_dst_example_simple_lap_s_facto0_sched4_not_tqrcpend (Timeout) 3061 - mpi_dst_example_simple_lap_s_facto0_sched4_kway_tqrcpbegin (Timeout) 3062 - mpi_dst_example_simple_lap_s_facto0_sched4_kway_tqrcpend (Timeout) 3063 - mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_tqrcpbegin (Timeout) 3065 - mpi_dst_example_simple_lap_s_facto0_sched4_not_rqrrtbegin (Timeout) 3067 - mpi_dst_example_simple_lap_s_facto0_sched4_kway_rqrrtbegin (Timeout) 3068 - mpi_dst_example_simple_lap_s_facto0_sched4_kway_rqrrtend (Timeout) 3069 - mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_rqrrtbegin (Timeout) 3070 - mpi_dst_example_simple_lap_s_facto0_sched4_kwayprojections_rqrrtend (Timeout) 3071 - mpi_dst_example_simple_lap_s_facto0_sched4_kway_pqrcpilu0 (Timeout) 3072 - mpi_dst_example_simple_lap_s_facto0_sched4_kway_pqrcpilu1 (Timeout) 3073 - mpi_dst_example_simple_lap_s_facto1_sched4_not_svdbegin (Timeout) 3075 - mpi_dst_example_simple_lap_s_facto1_sched4_kway_svdbegin (Timeout) 3076 - mpi_dst_example_simple_lap_s_facto1_sched4_kway_svdend (Timeout) 3077 - mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_svdbegin (Timeout) 3078 - 
mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_svdend (Timeout) 3079 - mpi_dst_example_simple_lap_s_facto1_sched4_not_pqrcpbegin (Timeout) 3081 - mpi_dst_example_simple_lap_s_facto1_sched4_kway_pqrcpbegin (Timeout) 3082 - mpi_dst_example_simple_lap_s_facto1_sched4_kway_pqrcpend (Timeout) 3083 - mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_pqrcpbegin (Timeout) 3085 - mpi_dst_example_simple_lap_s_facto1_sched4_not_rqrcpbegin (Timeout) 3087 - mpi_dst_example_simple_lap_s_facto1_sched4_kway_rqrcpbegin (Timeout) 3088 - mpi_dst_example_simple_lap_s_facto1_sched4_kway_rqrcpend (Timeout) 3090 - mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_rqrcpend (Timeout) 3091 - mpi_dst_example_simple_lap_s_facto1_sched4_not_tqrcpbegin (Timeout) 3092 - mpi_dst_example_simple_lap_s_facto1_sched4_not_tqrcpend (Timeout) 3093 - mpi_dst_example_simple_lap_s_facto1_sched4_kway_tqrcpbegin (Timeout) 3094 - mpi_dst_example_simple_lap_s_facto1_sched4_kway_tqrcpend (Timeout) 3095 - mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_tqrcpbegin (Timeout) 3096 - mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_tqrcpend (Timeout) 3098 - mpi_dst_example_simple_lap_s_facto1_sched4_not_rqrrtend (Timeout) 3100 - mpi_dst_example_simple_lap_s_facto1_sched4_kway_rqrrtend (Timeout) 3101 - mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_rqrrtbegin (Timeout) 3102 - mpi_dst_example_simple_lap_s_facto1_sched4_kwayprojections_rqrrtend (Timeout) 3103 - mpi_dst_example_simple_lap_s_facto1_sched4_kway_pqrcpilu0 (Timeout) 3104 - mpi_dst_example_simple_lap_s_facto1_sched4_kway_pqrcpilu1 (Timeout) 3105 - mpi_dst_example_simple_lap_s_facto2_sched4_not_svdbegin (Timeout) 3107 - mpi_dst_example_simple_lap_s_facto2_sched4_kway_svdbegin (Timeout) 3108 - mpi_dst_example_simple_lap_s_facto2_sched4_kway_svdend (Timeout) 3110 - mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_svdend (Timeout) 3111 - 
mpi_dst_example_simple_lap_s_facto2_sched4_not_pqrcpbegin (Timeout) 3112 - mpi_dst_example_simple_lap_s_facto2_sched4_not_pqrcpend (Timeout) 3113 - mpi_dst_example_simple_lap_s_facto2_sched4_kway_pqrcpbegin (Timeout) 3115 - mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_pqrcpbegin (Timeout) 3116 - mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_pqrcpend (Timeout) 3117 - mpi_dst_example_simple_lap_s_facto2_sched4_not_rqrcpbegin (Timeout) 3118 - mpi_dst_example_simple_lap_s_facto2_sched4_not_rqrcpend (Timeout) 3119 - mpi_dst_example_simple_lap_s_facto2_sched4_kway_rqrcpbegin (Timeout) 3121 - mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_rqrcpbegin (Timeout) 3122 - mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_rqrcpend (Timeout) 3123 - mpi_dst_example_simple_lap_s_facto2_sched4_not_tqrcpbegin (Timeout) 3124 - mpi_dst_example_simple_lap_s_facto2_sched4_not_tqrcpend (Timeout) 3125 - mpi_dst_example_simple_lap_s_facto2_sched4_kway_tqrcpbegin (Timeout) 3126 - mpi_dst_example_simple_lap_s_facto2_sched4_kway_tqrcpend (Timeout) 3127 - mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_tqrcpbegin (Timeout) 3128 - mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_tqrcpend (Timeout) 3129 - mpi_dst_example_simple_lap_s_facto2_sched4_not_rqrrtbegin (Timeout) 3130 - mpi_dst_example_simple_lap_s_facto2_sched4_not_rqrrtend (Timeout) 3131 - mpi_dst_example_simple_lap_s_facto2_sched4_kway_rqrrtbegin (Timeout) 3132 - mpi_dst_example_simple_lap_s_facto2_sched4_kway_rqrrtend (Timeout) 3133 - mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_rqrrtbegin (Timeout) 3134 - mpi_dst_example_simple_lap_s_facto2_sched4_kwayprojections_rqrrtend (Timeout) 3135 - mpi_dst_example_simple_lap_s_facto2_sched4_kway_pqrcpilu0 (Timeout) 3137 - mpi_dst_example_simple_lap_d_facto0_sched4_not_svdbegin (Timeout) 3138 - mpi_dst_example_simple_lap_d_facto0_sched4_not_svdend (Timeout) 3139 - 
mpi_dst_example_simple_lap_d_facto0_sched4_kway_svdbegin (Timeout) 3140 - mpi_dst_example_simple_lap_d_facto0_sched4_kway_svdend (Timeout) 3141 - mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_svdbegin (Timeout) 3142 - mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_svdend (Timeout) 3143 - mpi_dst_example_simple_lap_d_facto0_sched4_not_pqrcpbegin (Timeout) 3144 - mpi_dst_example_simple_lap_d_facto0_sched4_not_pqrcpend (Timeout) 3145 - mpi_dst_example_simple_lap_d_facto0_sched4_kway_pqrcpbegin (Timeout) 3146 - mpi_dst_example_simple_lap_d_facto0_sched4_kway_pqrcpend (Timeout) 3147 - mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_pqrcpbegin (Timeout) 3148 - mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_pqrcpend (Timeout) 3149 - mpi_dst_example_simple_lap_d_facto0_sched4_not_rqrcpbegin (Timeout) 3150 - mpi_dst_example_simple_lap_d_facto0_sched4_not_rqrcpend (Timeout) 3151 - mpi_dst_example_simple_lap_d_facto0_sched4_kway_rqrcpbegin (Timeout) 3153 - mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_rqrcpbegin (Timeout) 3154 - mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_rqrcpend (Timeout) 3155 - mpi_dst_example_simple_lap_d_facto0_sched4_not_tqrcpbegin (Timeout) 3156 - mpi_dst_example_simple_lap_d_facto0_sched4_not_tqrcpend (Timeout) 3157 - mpi_dst_example_simple_lap_d_facto0_sched4_kway_tqrcpbegin (Timeout) 3159 - mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_tqrcpbegin (Timeout) 3160 - mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_tqrcpend (Timeout) 3161 - mpi_dst_example_simple_lap_d_facto0_sched4_not_rqrrtbegin (Timeout) 3162 - mpi_dst_example_simple_lap_d_facto0_sched4_not_rqrrtend (Timeout) 3163 - mpi_dst_example_simple_lap_d_facto0_sched4_kway_rqrrtbegin (Timeout) 3164 - mpi_dst_example_simple_lap_d_facto0_sched4_kway_rqrrtend (Timeout) 3166 - mpi_dst_example_simple_lap_d_facto0_sched4_kwayprojections_rqrrtend (Timeout) 3167 - 
mpi_dst_example_simple_lap_d_facto0_sched4_kway_pqrcpilu0 (Timeout) 3168 - mpi_dst_example_simple_lap_d_facto0_sched4_kway_pqrcpilu1 (Timeout) 3169 - mpi_dst_example_simple_lap_d_facto1_sched4_not_svdbegin (Timeout) 3170 - mpi_dst_example_simple_lap_d_facto1_sched4_not_svdend (Timeout) 3174 - mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_svdend (Timeout) 3175 - mpi_dst_example_simple_lap_d_facto1_sched4_not_pqrcpbegin (Timeout) 3176 - mpi_dst_example_simple_lap_d_facto1_sched4_not_pqrcpend (Timeout) 3177 - mpi_dst_example_simple_lap_d_facto1_sched4_kway_pqrcpbegin (Timeout) 3178 - mpi_dst_example_simple_lap_d_facto1_sched4_kway_pqrcpend (Timeout) 3179 - mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_pqrcpbegin (Timeout) 3180 - mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_pqrcpend (Timeout) 3181 - mpi_dst_example_simple_lap_d_facto1_sched4_not_rqrcpbegin (Timeout) 3182 - mpi_dst_example_simple_lap_d_facto1_sched4_not_rqrcpend (Timeout) 3183 - mpi_dst_example_simple_lap_d_facto1_sched4_kway_rqrcpbegin (Timeout) 3186 - mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_rqrcpend (Timeout) 3189 - mpi_dst_example_simple_lap_d_facto1_sched4_kway_tqrcpbegin (Timeout) 3190 - mpi_dst_example_simple_lap_d_facto1_sched4_kway_tqrcpend (Timeout) 3191 - mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_tqrcpbegin (Timeout) 3192 - mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_tqrcpend (Timeout) 3193 - mpi_dst_example_simple_lap_d_facto1_sched4_not_rqrrtbegin (Timeout) 3194 - mpi_dst_example_simple_lap_d_facto1_sched4_not_rqrrtend (Timeout) 3195 - mpi_dst_example_simple_lap_d_facto1_sched4_kway_rqrrtbegin (Timeout) 3196 - mpi_dst_example_simple_lap_d_facto1_sched4_kway_rqrrtend (Timeout) 3197 - mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_rqrrtbegin (Timeout) 3198 - mpi_dst_example_simple_lap_d_facto1_sched4_kwayprojections_rqrrtend (Timeout) 3199 - 
mpi_dst_example_simple_lap_d_facto1_sched4_kway_pqrcpilu0 (Timeout) 3200 - mpi_dst_example_simple_lap_d_facto1_sched4_kway_pqrcpilu1 (Timeout) 3201 - mpi_dst_example_simple_lap_d_facto2_sched4_not_svdbegin (Timeout) 3202 - mpi_dst_example_simple_lap_d_facto2_sched4_not_svdend (Timeout) 3203 - mpi_dst_example_simple_lap_d_facto2_sched4_kway_svdbegin (Timeout) 3204 - mpi_dst_example_simple_lap_d_facto2_sched4_kway_svdend (Timeout) 3205 - mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_svdbegin (Timeout) 3206 - mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_svdend (Timeout) 3207 - mpi_dst_example_simple_lap_d_facto2_sched4_not_pqrcpbegin (Timeout) 3208 - mpi_dst_example_simple_lap_d_facto2_sched4_not_pqrcpend (Timeout) 3209 - mpi_dst_example_simple_lap_d_facto2_sched4_kway_pqrcpbegin (Timeout) 3210 - mpi_dst_example_simple_lap_d_facto2_sched4_kway_pqrcpend (Timeout) 3211 - mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_pqrcpbegin (Timeout) 3212 - mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_pqrcpend (Timeout) 3213 - mpi_dst_example_simple_lap_d_facto2_sched4_not_rqrcpbegin (Timeout) 3214 - mpi_dst_example_simple_lap_d_facto2_sched4_not_rqrcpend (Timeout) 3215 - mpi_dst_example_simple_lap_d_facto2_sched4_kway_rqrcpbegin (Timeout) 3216 - mpi_dst_example_simple_lap_d_facto2_sched4_kway_rqrcpend (Timeout) 3217 - mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_rqrcpbegin (Timeout) 3218 - mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_rqrcpend (Timeout) 3219 - mpi_dst_example_simple_lap_d_facto2_sched4_not_tqrcpbegin (Timeout) 3220 - mpi_dst_example_simple_lap_d_facto2_sched4_not_tqrcpend (Timeout) 3221 - mpi_dst_example_simple_lap_d_facto2_sched4_kway_tqrcpbegin (Timeout) 3222 - mpi_dst_example_simple_lap_d_facto2_sched4_kway_tqrcpend (Timeout) 3223 - mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_tqrcpbegin (Timeout) 3224 - 
mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_tqrcpend (Timeout) 3225 - mpi_dst_example_simple_lap_d_facto2_sched4_not_rqrrtbegin (Timeout) 3226 - mpi_dst_example_simple_lap_d_facto2_sched4_not_rqrrtend (Timeout) 3227 - mpi_dst_example_simple_lap_d_facto2_sched4_kway_rqrrtbegin (Timeout) 3228 - mpi_dst_example_simple_lap_d_facto2_sched4_kway_rqrrtend (Timeout) 3229 - mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_rqrrtbegin (Timeout) 3230 - mpi_dst_example_simple_lap_d_facto2_sched4_kwayprojections_rqrrtend (Timeout) 3231 - mpi_dst_example_simple_lap_d_facto2_sched4_kway_pqrcpilu0 (Timeout) 3232 - mpi_dst_example_simple_lap_d_facto2_sched4_kway_pqrcpilu1 (Timeout) 3233 - mpi_dst_example_simple_lap_c_facto0_sched4_not_svdbegin (Timeout) 3234 - mpi_dst_example_simple_lap_c_facto0_sched4_not_svdend (Timeout) 3235 - mpi_dst_example_simple_lap_c_facto0_sched4_kway_svdbegin (Timeout) 3236 - mpi_dst_example_simple_lap_c_facto0_sched4_kway_svdend (Timeout) 3237 - mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_svdbegin (Timeout) 3238 - mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_svdend (Timeout) 3239 - mpi_dst_example_simple_lap_c_facto0_sched4_not_pqrcpbegin (Timeout) 3240 - mpi_dst_example_simple_lap_c_facto0_sched4_not_pqrcpend (Timeout) 3241 - mpi_dst_example_simple_lap_c_facto0_sched4_kway_pqrcpbegin (Timeout) 3242 - mpi_dst_example_simple_lap_c_facto0_sched4_kway_pqrcpend (Timeout) 3243 - mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_pqrcpbegin (Timeout) 3244 - mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_pqrcpend (Timeout) 3245 - mpi_dst_example_simple_lap_c_facto0_sched4_not_rqrcpbegin (Timeout) 3246 - mpi_dst_example_simple_lap_c_facto0_sched4_not_rqrcpend (Timeout) 3247 - mpi_dst_example_simple_lap_c_facto0_sched4_kway_rqrcpbegin (Timeout) 3248 - mpi_dst_example_simple_lap_c_facto0_sched4_kway_rqrcpend (Timeout) 3249 - 
mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_rqrcpbegin (Timeout) 3250 - mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_rqrcpend (Timeout) 3251 - mpi_dst_example_simple_lap_c_facto0_sched4_not_tqrcpbegin (Timeout) 3252 - mpi_dst_example_simple_lap_c_facto0_sched4_not_tqrcpend (Timeout) 3253 - mpi_dst_example_simple_lap_c_facto0_sched4_kway_tqrcpbegin (Timeout) 3254 - mpi_dst_example_simple_lap_c_facto0_sched4_kway_tqrcpend (Timeout) 3255 - mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_tqrcpbegin (Timeout) 3256 - mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_tqrcpend (Timeout) 3257 - mpi_dst_example_simple_lap_c_facto0_sched4_not_rqrrtbegin (Timeout) 3258 - mpi_dst_example_simple_lap_c_facto0_sched4_not_rqrrtend (Timeout) 3259 - mpi_dst_example_simple_lap_c_facto0_sched4_kway_rqrrtbegin (Timeout) 3260 - mpi_dst_example_simple_lap_c_facto0_sched4_kway_rqrrtend (Timeout) 3261 - mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_rqrrtbegin (Timeout) 3262 - mpi_dst_example_simple_lap_c_facto0_sched4_kwayprojections_rqrrtend (Timeout) 3263 - mpi_dst_example_simple_lap_c_facto0_sched4_kway_pqrcpilu0 (Timeout) 3264 - mpi_dst_example_simple_lap_c_facto0_sched4_kway_pqrcpilu1 (Timeout) 3265 - mpi_dst_example_simple_lap_c_facto1_sched4_not_svdbegin (Timeout) 3266 - mpi_dst_example_simple_lap_c_facto1_sched4_not_svdend (Timeout) 3267 - mpi_dst_example_simple_lap_c_facto1_sched4_kway_svdbegin (Timeout) 3268 - mpi_dst_example_simple_lap_c_facto1_sched4_kway_svdend (Timeout) 3269 - mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_svdbegin (Timeout) 3270 - mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_svdend (Timeout) 3271 - mpi_dst_example_simple_lap_c_facto1_sched4_not_pqrcpbegin (Timeout) 3272 - mpi_dst_example_simple_lap_c_facto1_sched4_not_pqrcpend (Timeout) 3273 - mpi_dst_example_simple_lap_c_facto1_sched4_kway_pqrcpbegin (Timeout) 3274 - 
mpi_dst_example_simple_lap_c_facto1_sched4_kway_pqrcpend (Timeout) 3275 - mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_pqrcpbegin (Timeout) 3276 - mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_pqrcpend (Timeout) 3277 - mpi_dst_example_simple_lap_c_facto1_sched4_not_rqrcpbegin (Timeout) 3278 - mpi_dst_example_simple_lap_c_facto1_sched4_not_rqrcpend (Timeout) 3279 - mpi_dst_example_simple_lap_c_facto1_sched4_kway_rqrcpbegin (Timeout) 3280 - mpi_dst_example_simple_lap_c_facto1_sched4_kway_rqrcpend (Timeout) 3281 - mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_rqrcpbegin (Timeout) 3282 - mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_rqrcpend (Timeout) 3283 - mpi_dst_example_simple_lap_c_facto1_sched4_not_tqrcpbegin (Timeout) 3284 - mpi_dst_example_simple_lap_c_facto1_sched4_not_tqrcpend (Timeout) 3285 - mpi_dst_example_simple_lap_c_facto1_sched4_kway_tqrcpbegin (Timeout) 3286 - mpi_dst_example_simple_lap_c_facto1_sched4_kway_tqrcpend (Timeout) 3287 - mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_tqrcpbegin (Timeout) 3288 - mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_tqrcpend (Timeout) 3289 - mpi_dst_example_simple_lap_c_facto1_sched4_not_rqrrtbegin (Timeout) 3290 - mpi_dst_example_simple_lap_c_facto1_sched4_not_rqrrtend (Timeout) 3291 - mpi_dst_example_simple_lap_c_facto1_sched4_kway_rqrrtbegin (Timeout) 3292 - mpi_dst_example_simple_lap_c_facto1_sched4_kway_rqrrtend (Timeout) 3293 - mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_rqrrtbegin (Timeout) 3294 - mpi_dst_example_simple_lap_c_facto1_sched4_kwayprojections_rqrrtend (Timeout) 3295 - mpi_dst_example_simple_lap_c_facto1_sched4_kway_pqrcpilu0 (Timeout) 3296 - mpi_dst_example_simple_lap_c_facto1_sched4_kway_pqrcpilu1 (Timeout) 3297 - mpi_dst_example_simple_lap_c_facto2_sched4_not_svdbegin (Timeout) 3298 - mpi_dst_example_simple_lap_c_facto2_sched4_not_svdend (Timeout) 3299 - 
mpi_dst_example_simple_lap_c_facto2_sched4_kway_svdbegin (Timeout) 3300 - mpi_dst_example_simple_lap_c_facto2_sched4_kway_svdend (Timeout) 3301 - mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_svdbegin (Timeout) 3302 - mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_svdend (Timeout) 3303 - mpi_dst_example_simple_lap_c_facto2_sched4_not_pqrcpbegin (Timeout) 3304 - mpi_dst_example_simple_lap_c_facto2_sched4_not_pqrcpend (Timeout) 3305 - mpi_dst_example_simple_lap_c_facto2_sched4_kway_pqrcpbegin (Timeout) 3306 - mpi_dst_example_simple_lap_c_facto2_sched4_kway_pqrcpend (Timeout) 3307 - mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_pqrcpbegin (Timeout) 3308 - mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_pqrcpend (Timeout) 3309 - mpi_dst_example_simple_lap_c_facto2_sched4_not_rqrcpbegin (Timeout) 3310 - mpi_dst_example_simple_lap_c_facto2_sched4_not_rqrcpend (Timeout) 3311 - mpi_dst_example_simple_lap_c_facto2_sched4_kway_rqrcpbegin (Timeout) 3312 - mpi_dst_example_simple_lap_c_facto2_sched4_kway_rqrcpend (Timeout) 3313 - mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_rqrcpbegin (Timeout) 3314 - mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_rqrcpend (Timeout) 3315 - mpi_dst_example_simple_lap_c_facto2_sched4_not_tqrcpbegin (Timeout) 3316 - mpi_dst_example_simple_lap_c_facto2_sched4_not_tqrcpend (Timeout) 3317 - mpi_dst_example_simple_lap_c_facto2_sched4_kway_tqrcpbegin (Timeout) 3318 - mpi_dst_example_simple_lap_c_facto2_sched4_kway_tqrcpend (Timeout) 3319 - mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_tqrcpbegin (Timeout) 3320 - mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_tqrcpend (Timeout) 3321 - mpi_dst_example_simple_lap_c_facto2_sched4_not_rqrrtbegin (Timeout) 3322 - mpi_dst_example_simple_lap_c_facto2_sched4_not_rqrrtend (Timeout) 3323 - mpi_dst_example_simple_lap_c_facto2_sched4_kway_rqrrtbegin (Timeout) 3324 - 
mpi_dst_example_simple_lap_c_facto2_sched4_kway_rqrrtend (Timeout) 3325 - mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_rqrrtbegin (Timeout) 3326 - mpi_dst_example_simple_lap_c_facto2_sched4_kwayprojections_rqrrtend (Timeout) 3327 - mpi_dst_example_simple_lap_c_facto2_sched4_kway_pqrcpilu0 (Timeout) 3328 - mpi_dst_example_simple_lap_c_facto2_sched4_kway_pqrcpilu1 (Timeout) 3329 - mpi_dst_example_simple_lap_c_facto3_sched4_not_svdbegin (Timeout) 3330 - mpi_dst_example_simple_lap_c_facto3_sched4_not_svdend (Timeout) 3331 - mpi_dst_example_simple_lap_c_facto3_sched4_kway_svdbegin (Timeout) 3332 - mpi_dst_example_simple_lap_c_facto3_sched4_kway_svdend (Timeout) 3333 - mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_svdbegin (Timeout) 3334 - mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_svdend (Timeout) 3335 - mpi_dst_example_simple_lap_c_facto3_sched4_not_pqrcpbegin (Timeout) 3336 - mpi_dst_example_simple_lap_c_facto3_sched4_not_pqrcpend (Timeout) 3337 - mpi_dst_example_simple_lap_c_facto3_sched4_kway_pqrcpbegin (Timeout) 3338 - mpi_dst_example_simple_lap_c_facto3_sched4_kway_pqrcpend (Timeout) 3339 - mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_pqrcpbegin (Timeout) 3340 - mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_pqrcpend (Timeout) 3341 - mpi_dst_example_simple_lap_c_facto3_sched4_not_rqrcpbegin (Timeout) 3342 - mpi_dst_example_simple_lap_c_facto3_sched4_not_rqrcpend (Timeout) 3343 - mpi_dst_example_simple_lap_c_facto3_sched4_kway_rqrcpbegin (Failed) 3344 - mpi_dst_example_simple_lap_c_facto3_sched4_kway_rqrcpend (Timeout) 3345 - mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_rqrcpbegin (Timeout) 3346 - mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_rqrcpend (Timeout) 3347 - mpi_dst_example_simple_lap_c_facto3_sched4_not_tqrcpbegin (Timeout) 3348 - mpi_dst_example_simple_lap_c_facto3_sched4_not_tqrcpend (Timeout) 3349 - mpi_dst_example_simple_lap_c_facto3_sched4_kway_tqrcpbegin 
(Timeout) 3350 - mpi_dst_example_simple_lap_c_facto3_sched4_kway_tqrcpend (Timeout) 3351 - mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_tqrcpbegin (Timeout) 3352 - mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_tqrcpend (Timeout) 3353 - mpi_dst_example_simple_lap_c_facto3_sched4_not_rqrrtbegin (Timeout) 3354 - mpi_dst_example_simple_lap_c_facto3_sched4_not_rqrrtend (Timeout) 3355 - mpi_dst_example_simple_lap_c_facto3_sched4_kway_rqrrtbegin (Timeout) 3356 - mpi_dst_example_simple_lap_c_facto3_sched4_kway_rqrrtend (Timeout) 3357 - mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_rqrrtbegin (Timeout) 3358 - mpi_dst_example_simple_lap_c_facto3_sched4_kwayprojections_rqrrtend (Timeout) 3359 - mpi_dst_example_simple_lap_c_facto3_sched4_kway_pqrcpilu0 (Timeout) 3360 - mpi_dst_example_simple_lap_c_facto3_sched4_kway_pqrcpilu1 (Timeout) 3361 - mpi_dst_example_simple_lap_c_facto4_sched4_not_svdbegin (Timeout) 3362 - mpi_dst_example_simple_lap_c_facto4_sched4_not_svdend (Timeout) 3363 - mpi_dst_example_simple_lap_c_facto4_sched4_kway_svdbegin (Timeout) 3364 - mpi_dst_example_simple_lap_c_facto4_sched4_kway_svdend (Timeout) 3365 - mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_svdbegin (Timeout) 3366 - mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_svdend (Timeout) 3367 - mpi_dst_example_simple_lap_c_facto4_sched4_not_pqrcpbegin (Timeout) 3368 - mpi_dst_example_simple_lap_c_facto4_sched4_not_pqrcpend (Timeout) 3369 - mpi_dst_example_simple_lap_c_facto4_sched4_kway_pqrcpbegin (Timeout) 3370 - mpi_dst_example_simple_lap_c_facto4_sched4_kway_pqrcpend (Timeout) 3371 - mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_pqrcpbegin (Timeout) 3372 - mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_pqrcpend (Timeout) 3373 - mpi_dst_example_simple_lap_c_facto4_sched4_not_rqrcpbegin (Timeout) 3374 - mpi_dst_example_simple_lap_c_facto4_sched4_not_rqrcpend (Timeout) 3375 - 
mpi_dst_example_simple_lap_c_facto4_sched4_kway_rqrcpbegin (Timeout) 3376 - mpi_dst_example_simple_lap_c_facto4_sched4_kway_rqrcpend (Timeout) 3377 - mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_rqrcpbegin (Timeout) 3378 - mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_rqrcpend (Timeout) 3379 - mpi_dst_example_simple_lap_c_facto4_sched4_not_tqrcpbegin (Timeout) 3380 - mpi_dst_example_simple_lap_c_facto4_sched4_not_tqrcpend (Timeout) 3381 - mpi_dst_example_simple_lap_c_facto4_sched4_kway_tqrcpbegin (Timeout) 3382 - mpi_dst_example_simple_lap_c_facto4_sched4_kway_tqrcpend (Timeout) 3383 - mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_tqrcpbegin (Timeout) 3384 - mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_tqrcpend (Timeout) 3385 - mpi_dst_example_simple_lap_c_facto4_sched4_not_rqrrtbegin (Timeout) 3386 - mpi_dst_example_simple_lap_c_facto4_sched4_not_rqrrtend (Timeout) 3387 - mpi_dst_example_simple_lap_c_facto4_sched4_kway_rqrrtbegin (Timeout) 3388 - mpi_dst_example_simple_lap_c_facto4_sched4_kway_rqrrtend (Timeout) 3389 - mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_rqrrtbegin (Timeout) 3390 - mpi_dst_example_simple_lap_c_facto4_sched4_kwayprojections_rqrrtend (Timeout) 3391 - mpi_dst_example_simple_lap_c_facto4_sched4_kway_pqrcpilu0 (Timeout) 3392 - mpi_dst_example_simple_lap_c_facto4_sched4_kway_pqrcpilu1 (Timeout) 3393 - mpi_dst_example_simple_lap_z_facto0_sched4_not_svdbegin (Timeout) 3394 - mpi_dst_example_simple_lap_z_facto0_sched4_not_svdend (Timeout) 3395 - mpi_dst_example_simple_lap_z_facto0_sched4_kway_svdbegin (Timeout) 3396 - mpi_dst_example_simple_lap_z_facto0_sched4_kway_svdend (Timeout) 3397 - mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_svdbegin (Timeout) 3398 - mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_svdend (Timeout) 3399 - mpi_dst_example_simple_lap_z_facto0_sched4_not_pqrcpbegin (Timeout) 3400 - 
mpi_dst_example_simple_lap_z_facto0_sched4_not_pqrcpend (Timeout) 3401 - mpi_dst_example_simple_lap_z_facto0_sched4_kway_pqrcpbegin (Timeout) 3402 - mpi_dst_example_simple_lap_z_facto0_sched4_kway_pqrcpend (Timeout) 3403 - mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_pqrcpbegin (Timeout) 3404 - mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_pqrcpend (Timeout) 3405 - mpi_dst_example_simple_lap_z_facto0_sched4_not_rqrcpbegin (Timeout) 3406 - mpi_dst_example_simple_lap_z_facto0_sched4_not_rqrcpend (Timeout) 3407 - mpi_dst_example_simple_lap_z_facto0_sched4_kway_rqrcpbegin (Timeout) 3408 - mpi_dst_example_simple_lap_z_facto0_sched4_kway_rqrcpend (Timeout) 3409 - mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_rqrcpbegin (Timeout) 3410 - mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_rqrcpend (Timeout) 3411 - mpi_dst_example_simple_lap_z_facto0_sched4_not_tqrcpbegin (Timeout) 3412 - mpi_dst_example_simple_lap_z_facto0_sched4_not_tqrcpend (Timeout) 3413 - mpi_dst_example_simple_lap_z_facto0_sched4_kway_tqrcpbegin (Timeout) 3414 - mpi_dst_example_simple_lap_z_facto0_sched4_kway_tqrcpend (Timeout) 3415 - mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_tqrcpbegin (Timeout) 3416 - mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_tqrcpend (Timeout) 3417 - mpi_dst_example_simple_lap_z_facto0_sched4_not_rqrrtbegin (Timeout) 3418 - mpi_dst_example_simple_lap_z_facto0_sched4_not_rqrrtend (Timeout) 3419 - mpi_dst_example_simple_lap_z_facto0_sched4_kway_rqrrtbegin (Timeout) 3420 - mpi_dst_example_simple_lap_z_facto0_sched4_kway_rqrrtend (Timeout) 3421 - mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_rqrrtbegin (Timeout) 3422 - mpi_dst_example_simple_lap_z_facto0_sched4_kwayprojections_rqrrtend (Timeout) 3423 - mpi_dst_example_simple_lap_z_facto0_sched4_kway_pqrcpilu0 (Timeout) 3424 - mpi_dst_example_simple_lap_z_facto0_sched4_kway_pqrcpilu1 (Timeout) 3425 - 
mpi_dst_example_simple_lap_z_facto1_sched4_not_svdbegin (Timeout) 3426 - mpi_dst_example_simple_lap_z_facto1_sched4_not_svdend (Timeout) 3427 - mpi_dst_example_simple_lap_z_facto1_sched4_kway_svdbegin (Timeout) 3428 - mpi_dst_example_simple_lap_z_facto1_sched4_kway_svdend (Timeout) 3429 - mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_svdbegin (Timeout) 3430 - mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_svdend (Timeout) 3431 - mpi_dst_example_simple_lap_z_facto1_sched4_not_pqrcpbegin (Timeout) 3432 - mpi_dst_example_simple_lap_z_facto1_sched4_not_pqrcpend (Timeout) 3433 - mpi_dst_example_simple_lap_z_facto1_sched4_kway_pqrcpbegin (Timeout) 3434 - mpi_dst_example_simple_lap_z_facto1_sched4_kway_pqrcpend (Timeout) 3435 - mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_pqrcpbegin (Timeout) 3436 - mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_pqrcpend (Timeout) 3437 - mpi_dst_example_simple_lap_z_facto1_sched4_not_rqrcpbegin (Timeout) 3438 - mpi_dst_example_simple_lap_z_facto1_sched4_not_rqrcpend (Timeout) 3439 - mpi_dst_example_simple_lap_z_facto1_sched4_kway_rqrcpbegin (Timeout) 3440 - mpi_dst_example_simple_lap_z_facto1_sched4_kway_rqrcpend (Timeout) 3441 - mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_rqrcpbegin (Timeout) 3442 - mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_rqrcpend (Timeout) 3443 - mpi_dst_example_simple_lap_z_facto1_sched4_not_tqrcpbegin (Timeout) 3444 - mpi_dst_example_simple_lap_z_facto1_sched4_not_tqrcpend (Timeout) 3445 - mpi_dst_example_simple_lap_z_facto1_sched4_kway_tqrcpbegin (Timeout) 3446 - mpi_dst_example_simple_lap_z_facto1_sched4_kway_tqrcpend (Timeout) 3447 - mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_tqrcpbegin (Timeout) 3448 - mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_tqrcpend (Timeout) 3449 - mpi_dst_example_simple_lap_z_facto1_sched4_not_rqrrtbegin (Timeout) 3450 - mpi_dst_example_simple_lap_z_facto1_sched4_not_rqrrtend 
(Timeout) 3451 - mpi_dst_example_simple_lap_z_facto1_sched4_kway_rqrrtbegin (Timeout) 3453 - mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_rqrrtbegin (Timeout) 3454 - mpi_dst_example_simple_lap_z_facto1_sched4_kwayprojections_rqrrtend (Timeout) 3455 - mpi_dst_example_simple_lap_z_facto1_sched4_kway_pqrcpilu0 (Timeout) 3456 - mpi_dst_example_simple_lap_z_facto1_sched4_kway_pqrcpilu1 (Timeout) 3457 - mpi_dst_example_simple_lap_z_facto2_sched4_not_svdbegin (Timeout) 3458 - mpi_dst_example_simple_lap_z_facto2_sched4_not_svdend (Timeout) 3459 - mpi_dst_example_simple_lap_z_facto2_sched4_kway_svdbegin (Timeout) 3460 - mpi_dst_example_simple_lap_z_facto2_sched4_kway_svdend (Timeout) 3461 - mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_svdbegin (Timeout) 3462 - mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_svdend (Timeout) 3464 - mpi_dst_example_simple_lap_z_facto2_sched4_not_pqrcpend (Timeout) 3465 - mpi_dst_example_simple_lap_z_facto2_sched4_kway_pqrcpbegin (Timeout) 3466 - mpi_dst_example_simple_lap_z_facto2_sched4_kway_pqrcpend (Timeout) 3468 - mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_pqrcpend (Timeout) 3469 - mpi_dst_example_simple_lap_z_facto2_sched4_not_rqrcpbegin (Timeout) 3470 - mpi_dst_example_simple_lap_z_facto2_sched4_not_rqrcpend (Timeout) 3471 - mpi_dst_example_simple_lap_z_facto2_sched4_kway_rqrcpbegin (Timeout) 3472 - mpi_dst_example_simple_lap_z_facto2_sched4_kway_rqrcpend (Timeout) 3473 - mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_rqrcpbegin (Timeout) 3474 - mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_rqrcpend (Timeout) 3475 - mpi_dst_example_simple_lap_z_facto2_sched4_not_tqrcpbegin (Timeout) 3476 - mpi_dst_example_simple_lap_z_facto2_sched4_not_tqrcpend (Timeout) 3477 - mpi_dst_example_simple_lap_z_facto2_sched4_kway_tqrcpbegin (Timeout) 3479 - mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_tqrcpbegin (Timeout) 3481 - 
mpi_dst_example_simple_lap_z_facto2_sched4_not_rqrrtbegin (Timeout) 3482 - mpi_dst_example_simple_lap_z_facto2_sched4_not_rqrrtend (Timeout) 3483 - mpi_dst_example_simple_lap_z_facto2_sched4_kway_rqrrtbegin (Timeout) 3484 - mpi_dst_example_simple_lap_z_facto2_sched4_kway_rqrrtend (Timeout) 3485 - mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_rqrrtbegin (Timeout) 3486 - mpi_dst_example_simple_lap_z_facto2_sched4_kwayprojections_rqrrtend (Timeout) 3487 - mpi_dst_example_simple_lap_z_facto2_sched4_kway_pqrcpilu0 (Timeout) 3488 - mpi_dst_example_simple_lap_z_facto2_sched4_kway_pqrcpilu1 (Timeout) 3489 - mpi_dst_example_simple_lap_z_facto3_sched4_not_svdbegin (Timeout) 3490 - mpi_dst_example_simple_lap_z_facto3_sched4_not_svdend (Timeout) 3491 - mpi_dst_example_simple_lap_z_facto3_sched4_kway_svdbegin (Timeout) 3492 - mpi_dst_example_simple_lap_z_facto3_sched4_kway_svdend (Timeout) 3493 - mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_svdbegin (Timeout) 3494 - mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_svdend (Timeout) 3495 - mpi_dst_example_simple_lap_z_facto3_sched4_not_pqrcpbegin (Timeout) 3497 - mpi_dst_example_simple_lap_z_facto3_sched4_kway_pqrcpbegin (Timeout) 3498 - mpi_dst_example_simple_lap_z_facto3_sched4_kway_pqrcpend (Timeout) 3499 - mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_pqrcpbegin (Timeout) 3501 - mpi_dst_example_simple_lap_z_facto3_sched4_not_rqrcpbegin (Timeout) 3503 - mpi_dst_example_simple_lap_z_facto3_sched4_kway_rqrcpbegin (Timeout) 3504 - mpi_dst_example_simple_lap_z_facto3_sched4_kway_rqrcpend (Timeout) 3505 - mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_rqrcpbegin (Timeout) 3506 - mpi_dst_example_simple_lap_z_facto3_sched4_kwayprojections_rqrcpend (Timeout) 3507 - mpi_dst_example_simple_lap_z_facto3_sched4_not_tqrcpbegin (Timeout) 3509 - mpi_dst_example_simple_lap_z_facto3_sched4_kway_tqrcpbegin (Timeout) 3554 - bcsc_shm_test_bcsc_spmv_tests_lap_d (Timeout) 3557 - 
bcsc_shm_test_bcsc_spmv_tests_rsa (Timeout) 3558 - bcsc_shm_test_bcsc_spmv_tests_mm (Timeout) 3559 - bcsc_shm_test_bcsc_spmv_tests_hb (Timeout) 3561 - bcsc_shm_test_bcsc_spmv_time_lap_s (Timeout) 3562 - bcsc_shm_test_bcsc_spmv_time_lap_d (Timeout) 3563 - bcsc_shm_test_bcsc_spmv_time_lap_c (Timeout) 3564 - bcsc_shm_test_bcsc_spmv_time_lap_z (Timeout) 3566 - bcsc_shm_test_bcsc_spmv_time_mm (Timeout) 3567 - bcsc_shm_test_bcsc_spmv_time_hb (Timeout) 3569 - bcsc_shm_test_bvec_gemv_tests (Timeout) 3590 - bcsc_mpi_rep_test_bvec_applyorder_tests (Timeout) 3597 - bcsc_mpi_dst_test_bcsc_spmv_tests_hb (Timeout) 3609 - fortran_shm_fsimple (Timeout) 3611 - fortran_shm_flaplacian (Timeout) 3617 - fortran_shm_fusermat_csr (Timeout) Errors while running CTest ==> ERROR: A failure occurred in check().  Aborting... ==> ERROR: Build failed, check /var/lib/archbuild/extra-riscv64/felix-3/build receiving incremental file list pastix-6.4.0-5-riscv64-build.log pastix-6.4.0-5-riscv64-check.log sent 62 bytes received 654,573 bytes 261,854.00 bytes/sec total size is 16,965,206 speedup is 25.92